Abstract

This paper considers multiaccess multiple-input multiple-output (MIMO) systems with finite rate feedback. 
The goal is to understand how to efficiently employ the given finite feedback resource to maximize the sum rate by 
characterizing the performance analytically. Towards this, we propose a joint quantization and feedback strategy: 
the base station selects the strongest users, jointly quantizes their strongest eigen-channel vectors and broadcasts a 
common feedback to all the users. This joint strategy is different from an individual strategy, in which quantization 
and feedback are performed across users independently, and it improves upon the individual strategy in the same 
way that vector quantization improves upon scalar quantization. In our proposed strategy, the effect of user selection 
is analyzed by extreme order statistics, while the effect of joint quantization is quantified by what we term "the 
composite Grassmann manifold". The achievable sum rate is then estimated by random matrix theory. Due to its 
simple implementation and solid performance analysis, the proposed scheme provides a benchmark for multiaccess 
MIMO systems with finite rate feedback. 


I. Introduction 

This paper considers multiaccess systems, corresponding to the uplink of cellular systems, where both
the base station and the multiple users are equipped with multiple antennas. Multiple antenna systems,
also known as multiple-input multiple-output (MIMO) systems, provide significant benefit over single
antenna systems in terms of increased spectral efficiency and/or reliability. The full potential of MIMO,
though, requires perfect channel state information (CSI) at both the transmitter and the receiver. While it is
often reasonable to assume that the receiver has perfect CSI through a pilot signal, assuming perfect CSI
at the transmitter (CSIT) is typically unrealistic. In many practical systems, the transmitter obtains CSI
through a finite rate feedback from the receiver. Note that a wireless fading channel may have infinitely
many channel states, and a finite rate feedback implies that CSIT is imperfect. One expects a performance
degradation, and here we focus on the quantitative effect of finite rate feedback and the corresponding
design.

Insight from single user MIMO systems with finite rate feedback proves beneficial. Single user systems 
are similar to multiaccess systems in the sense that there is only one receiver in both systems. The receiver 
knows the channel states perfectly and helps transmitters adapt their signals to maximize throughput. The 
essential difference between these two types of systems lies in the modes of antenna cooperation. In 
single user MIMO systems, all the transmit antennas are able to cooperate in sending a given message. In 
multiaccess systems, different users have independent messages, and transmit antennas belonging to one 
user cannot aid the transmission of another user's message. Due to this additional constraint, the analysis 
and design of multiaccess systems becomes more complicated. Still, we will borrow insight from single 
user systems to simplify the design of multiaccess systems. For single user MIMO systems, strategies to 
maximize throughput with perfect CSIT and without CSIT are derived and analyzed in [1]. When only 
finite rate feedback is available, the focus has moved toward the development of suboptimal strategies 
as a simplification. The dominant approach is based on the power on/off strategy, in which a data stream is
either turned on with a pre-determined constant power or turned off (zero power). Systems with only one
stream are considered in [2]-[4]. Systems with multiple independent streams are investigated in [5]-[11].
It appears that the power on/off strategy is near optimal compared to the optimal power water-filling allocation
[10].

We aim to understand how to efficiently employ the given finite feedback resource to maximize the 
sum rate by characterizing performance analytically. The full multiaccess MIMO problem still appears
beyond reach mathematically and is left for the future. In this paper, we propose a suboptimal strategy
by borrowing insight and methods from single user systems. Specifically, the base station selects the 
strongest users, jointly quantizes their strongest eigen-channel vectors and broadcasts a common feedback 
to all the users. Instead of designing a specific quantization codebook, we show that the performance of
a random codebook is optimal in probability. After receiving the feedback information, each selected on-user
employs the power on/off strategy and transmits along the beamforming vector selected by the feedback. Here,
joint quantization and feedback are employed based on the plain fact that vector quantization is better than
scalar quantization [12, Ch. 13]. (The precise gain will be verified empirically.) It is also worth noting
that, as we shall discuss in Sections IV and V, antenna selection can be viewed as a simplified version of
the proposed scheme.

This approach differs from the ongoing research for broadcast channels (BC) with finite rate feedback. 
While there is a well known duality between broadcast and multiaccess systems [13], this duality requires 
full CSI at both the transmitters and the receivers and is not available when only partial CSIT is provided. 
When CSIT is available only through finite rate feedback, broadcast systems suffer from the so called 
interference domination phenomenon [14], [15]. The major effort in research is to limit the interference 
among users. Sharif and Hassibi select near-orthogonal channels when the number of users is
sufficiently large [14], [15]. When the number of users is comparable to the number of antennas at the
base station, Jindal shows that the feedback rate should be proportional to the signal-to-noise ratio (SNR) if
the number of users turned on is fixed [15], while we show that the number of users should be adapted to
the SNR if the feedback rate is given [16]. However, the interference domination phenomenon does not
appear in multiaccess systems. Note also that the search for near-orthogonal channels suffers from exponentially
increasing complexity. Neither the results nor the methods for broadcast systems can be directly applied
to the problem discussed in this paper.

Though the strategy in this paper is relatively simple, the corresponding performance analysis is 
nontrivial. Our main analytical result is an upper bound on the sum rate, which to our knowledge is 
the best to date. The effect of user/antenna selection is analyzed by extreme order statistics, and the 
effect of eigen-channel vectors joint quantization is quantified via the composite Grassmann manifold. 
Interestingly, the complicated effect of imperfect CSIT and feedback is eventually described by a single 
constant, which we term the power efficiency factor. Successful evaluation of the power efficiency factor
enables us to characterize the upper bound on the sum rate. The quality of the upper bound is
supported by simulations of several systems over a large range of SNRs.

The rest of this paper is organized as follows. The general model for multiaccess systems with finite
rate feedback is described in Section II. The mathematical results developed for performance analysis are
assembled in Section III. The antenna selection strategy is analyzed in Section IV-A. Then a suboptimal
strategy is proposed and analyzed in Section IV-B. In Section V, simulation results are presented and
discussed. Finally, Section VI summarizes the paper.

II. System Model 

Assume that there are $L_R$ antennas at the base station and $N$ users communicating with the base station.
Assume that user $i$ (footnote 1) has $L_{T,i}$ transmit antennas, $1 \le i \le N$. Throughout we will set $L_{T,1} = \cdots = L_{T,N} = L_T$.

1 When a user joins the multiaccess system, a unique index is assigned and kept constant. A user in a multiaccess system is aware of
the corresponding index.

The signal transmission model is

$$ Y = \sum_{i=1}^{N} H_i T_i + W, $$



where $Y \in \mathbb{C}^{L_R \times 1}$ is the received signal at the base station, $H_i \in \mathbb{C}^{L_R \times L_T}$ is the channel state matrix for
user $i$, $T_i \in \mathbb{C}^{L_T \times 1}$ is the transmitted Gaussian signal vector for user $i$, and $W \in \mathbb{C}^{L_R \times 1}$ is the additive
Gaussian noise vector with zero mean and covariance matrix $I_{L_R}$. We assume the Rayleigh fading channel
model: the entries of the $H_i$'s are independent and identically distributed (i.i.d.) circularly symmetric complex
Gaussian variables with zero mean and unit variance ($\mathcal{CN}(0,1)$), and the $H_i$'s are independent across $i$.

We further assume that there exists a feedback link from the base station to the users. At the beginning
of each channel use, the channel states $H_i$'s are perfectly estimated at the receiver (the base station).
This assumption is valid in practice since most communication standards allow the receiver to learn the
channel states from pilot signals. A common message, which is a function of the channel states, is sent
back to all users through the feedback link. We assume that the feedback link is rate limited and error-free.
The feedback directs the users to choose their Gaussian signal covariance matrices. In a multiaccess
communication system, different users cannot cooperate in terms of information message, leading to
$E[T_i T_j^\dagger] = 0$ for $i \ne j$. Let $T = [T_1^\dagger \cdots T_N^\dagger]^\dagger$ be the overall transmitted Gaussian signal for all users
and $\Sigma = E[T T^\dagger]$ be the overall signal covariance matrix. Then $\Sigma$ is an $NL_T \times NL_T$ block diagonal
matrix whose $i$th diagonal block is the $L_T \times L_T$ covariance matrix $E[T_i T_i^\dagger]$. Let $H = [H_1 H_2 \cdots H_N]$
be the overall channel state matrix. An extension of [17] shows that the optimal feedback strategy is to
feed back the index of an appropriate covariance matrix, which is a function of the current channel state $H$.
Last, assume that there is a covariance matrix codebook $\mathcal{B}_\Sigma = \{\Sigma_1, \cdots, \Sigma_{K_B}\}$ (with finite size) declared
to both the base station and the users, where each $\Sigma_k \in \mathcal{B}_\Sigma$ is an overall signal covariance matrix with
the block diagonal structure just described, and $K_B$ is the size of the codebook. The feedback function $\varphi$
is a map from $\{H \in \mathbb{C}^{L_R \times NL_T}\}$ onto the index set $\{1, \cdots, K_B\}$. Subject to the finite rate feedback
constraint

$$ |\mathcal{B}_\Sigma| = K_B $$

and the average total transmission power constraint

$$ E_H\left[\mathrm{tr}\left(\Sigma_{\varphi(H)}\right)\right] \le \rho, $$

the sum rate of the optimal feedback strategy is given by

$$ \sup_{\mathcal{B}_\Sigma} \, \sup_{\varphi(\cdot)} \, E_H\left[\log\left|I_{L_R} + H \Sigma_{\varphi(H)} H^\dagger\right|\right]. \tag{1} $$

Here, since only symmetric systems are considered, the total power constraint $\rho$ is equivalent to the individual
power constraint $\rho/N$. Note that the optimal strategy involves two coupled optimization problems. It is
difficult, if not impossible, to find its explicit form and performance. Instead, we shall study two suboptimal
strategies and characterize their sum rates in Section IV.

III. Preliminaries 

This section assembles mathematical results required for later analysis. The reader may proceed directly
to Section IV for the main engineering results.

A. Order Statistics for Chi-Square Random Variables

Define $X_i = \sum_{j=1}^{L} |h_{i,j}|^2$, where the $h_{i,j}$, $1 \le j \le L$, $1 \le i \le n$, are i.i.d. $\mathcal{CN}(0,1)$. Then each $X_i$ has a
Chi-square distribution with probability density function (PDF)

$$ f_X(x) = \frac{1}{(L-1)!} x^{L-1} e^{-x}. $$

Denote the corresponding cumulative distribution function (CDF) by $F_X(x)$. Next introduce the order
statistics for these variables: that is, the non-decreasing list $X_{(1:n)} \le X_{(2:n)} \le \cdots \le X_{(n:n)}$ connected
with each realization. Here, the subscript $(k:n)$ indicates that $X_{(k:n)}$ is the $k$th smallest value. (We follow the
convention of [18].) Note of course that ties occur with probability zero and can be broken arbitrarily.
We will need the following, which is proved in Appendix B.
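Lemma 1: With the notation set out above, for any fixed positive integer $s$ it holds that

$$ \lim_{n \to +\infty} E\left[ \frac{\sum_{k=1}^{s} X_{(n-k+1:n)} - s a_n}{b_n} \right] = s \left( \mu_1 + 1 - \sum_{i=1}^{s} \frac{1}{i} \right), \tag{2} $$

where

$$ a_n = \inf\left\{ x : 1 - F_X(x) \le \frac{1}{n} \right\}, \qquad b_n = \frac{\sum_{i=0}^{L-1} \frac{1}{i!} a_n^{i}}{\frac{1}{(L-1)!} a_n^{L-1}}, $$

and $\mu_1 = \int_{-\infty}^{+\infty} x \, d e^{-e^{-x}}$ may be computed numerically.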

The limiting result in expectation immediately provides the following approximation for a fixed $s$:

$$ E\left[ \sum_{k=1}^{s} X_{(n-k+1:n)} \right] \approx s a_n + s b_n \left( \mu_1 + 1 - \sum_{i=1}^{s} \frac{1}{i} \right). \tag{3} $$

The shape of $F_X$ guarantees that $a_n$ and so $b_n$ are finite for any fixed $n$ but tend to infinity and one,
respectively, as $n$ grows.
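The normalizing constants of this subsection are easy to evaluate numerically. The following is a minimal sketch, not part of the original analysis: it assumes $F_X$ is the Gamma$(L,1)$ CDF implied by the PDF above, finds $a_n$ by bisection, and takes $b_n$ to be the ratio of the survival function to the density at $a_n$ (our reading of the definition). The function names and the Monte Carlo helper `top_s_mean` are hypothetical and serve only as a rough empirical reference for the sum of the $s$ largest order statistics.

```python
import math
import random


def survival(x, L):
    """1 - F_X(x) for a sum of L unit-mean exponentials (Gamma(L, 1))."""
    return math.exp(-x) * sum(x ** i / math.factorial(i) for i in range(L))


def a_n(n, L):
    """a_n = inf{x : 1 - F_X(x) <= 1/n}; bisection works because survival is decreasing."""
    lo, hi = 0.0, 1.0
    while survival(hi, L) > 1.0 / n:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if survival(mid, L) > 1.0 / n:
            lo = mid
        else:
            hi = mid
    return hi


def b_n(n, L):
    """Assumed form of b_n: survival(a_n) / density(a_n), with the common e^{-a_n} cancelled."""
    a = a_n(n, L)
    numerator = sum(a ** i / math.factorial(i) for i in range(L))
    denominator = a ** (L - 1) / math.factorial(L - 1)
    return numerator / denominator


def top_s_mean(n, L, s, trials, seed=0):
    """Monte Carlo estimate of E[sum of the s largest of n i.i.d. Gamma(L, 1) variables]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = sorted(rng.gammavariate(L, 1.0) for _ in range(n))
        total += sum(xs[-s:])
    return total / trials
```

For example, `a_n(1000, 4)` and `b_n(1000, 4)` give the centering and scaling constants for $n = 1000$ variables with $L = 4$, and `top_s_mean(1000, 4, 2, 500)` gives an empirical value that an approximation of the form (3) can be compared against.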



B. Conditioned Eigenvalues of the Wishart Matrix 

Let $H \in \mathbb{L}^{n \times m}$ be a random $n \times m$ matrix whose entries are i.i.d. Gaussian random variables with zero
mean and unit variance, where $\mathbb{L}$ is either $\mathbb{R}$ or $\mathbb{C}$. Throughout, we refer to $H$ as the standard Gaussian
random matrix. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ be the ordered eigenvalues of $W = HH^\dagger$ ($W$ is Wishart
distributed [19]).

This subsection takes up an estimate of $E[\lambda_i | \mathrm{tr}(W)]$, where $\mathrm{tr}(\cdot)$ is the usual matrix trace. In particular,
while a closed formula for this object would be rather involved, we may use random matrix theory to
obtain an approximation. The first ingredient is the following.

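Lemma 2: Let $H \in \mathbb{L}^{n \times m}$ (w.l.o.g. $n \le m$; see footnote 2) be a standard random Gaussian matrix. Let $\lambda_1 \ge \lambda_2 \ge
\cdots \ge \lambda_n$ be the ordered eigenvalues of $W = HH^\dagger$. Then

$$ E[\lambda_i | \mathrm{tr}(W) = c] = \zeta_i c, $$

where

$$ \zeta_i := E[\lambda_i | \mathrm{tr}(W) = 1]. \tag{4} $$

2 If $n > m$, $E[\lambda_i | \mathrm{tr}(HH^\dagger) = c] = 0$ for $i > m$ and $E[\lambda_i | \mathrm{tr}(HH^\dagger) = c] = \zeta_i' c$ for $i \le m$, where $\zeta_i' := E[\lambda_i | \mathrm{tr}(H^\dagger H) = 1]$. The
calculation of $\zeta_i'$ for $i \le m$ is included in Theorem 2 as well.

Proof: The joint density of the ordered eigenvalues of $W$ is known to be

$$ K_{m,n,\beta} \, e^{-\frac{\beta}{2} \sum_{i=1}^{n} \lambda_i} \prod_{i=1}^{n} \lambda_i^{\frac{\beta}{2}(m-n+1)-1} \, |\Delta_n(\lambda)|^{\beta}, \tag{5} $$

where $\lambda_1 \ge \cdots \ge \lambda_n \ge 0$, $|\Delta_n(\lambda)| = \prod_{i<j} (\lambda_i - \lambda_j)$,

$$ \beta = \begin{cases} 1 & \text{if } \mathbb{L} = \mathbb{R} \\ 2 & \text{if } \mathbb{L} = \mathbb{C} \end{cases} $$

and $K_{m,n,\beta}$ is a normalizing factor ([19, pg. 107] for the real case and [1] for the complex case). Write
out the formula for $E[\lambda_i | \sum_{j=1}^{n} \lambda_j = c]$ and use the variable change $\lambda_j' = \lambda_j / c$. After some elementary
calculations, it can be shown that

$$ E[\lambda_i | \mathrm{tr}(W) = c] = c \cdot \frac{\int_{\sum_j \lambda_j = 1} \lambda_i \prod_{j=1}^{n} \lambda_j^{\frac{\beta}{2}(m-n+1)-1} |\Delta_n(\lambda)|^{\beta} \prod_j d\lambda_j}{\int_{\sum_j \lambda_j = 1} \prod_{j=1}^{n} \lambda_j^{\frac{\beta}{2}(m-n+1)-1} |\Delta_n(\lambda)|^{\beta} \prod_j d\lambda_j} = c \, E[\lambda_i | \mathrm{tr}(W) = 1]. $$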

Given the preceding observation, we require an estimate for $\zeta_i$ in (4). For this we turn to the asymptotic
behavior of the ordered eigenvalues.

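Lemma 3: Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ be the ordered eigenvalues of $\frac{1}{m} HH^\dagger$, where $H \in \mathbb{L}^{n \times m}$
($\mathbb{L} = \mathbb{R}$ or $\mathbb{C}$) is a standard random Gaussian matrix. As $n, m \to \infty$ with $\frac{m}{n} \to \bar{m} \in \mathbb{R}^+$, for a given
$\tau \in (0, \min(1, \bar{m}))$,

$$ \lim_{(n,m) \to \infty} E\left[ \frac{1}{n} \sum_{1 \le i \le n\tau} \lambda_i \right] = \frac{\bar{m}}{2\pi} \left[ \frac{1 + \frac{1}{\bar{m}} - a}{2} \sqrt{(\lambda^+ - a)(a - \lambda^-)} + \frac{2}{\bar{m}} \left( \frac{\pi}{2} + \sin^{-1}\left( \frac{\sqrt{\bar{m}} \left(1 + \frac{1}{\bar{m}} - a\right)}{2} \right) \right) \right], $$

where $\lambda^+ = \left(1 + \sqrt{1/\bar{m}}\right)^2$, $\lambda^- = \left(1 - \sqrt{1/\bar{m}}\right)^2$, and $a \in (\lambda^-, \lambda^+)$ satisfies

$$ \frac{\bar{m}}{2\pi} \left[ -\sqrt{(\lambda^+ - a)(a - \lambda^-)} + \frac{1 + \bar{m}}{\bar{m}} \left( \frac{\pi}{2} + \sin^{-1}\left( \frac{\sqrt{\bar{m}} \left(1 + \frac{1}{\bar{m}} - a\right)}{2} \right) \right) - \frac{|\bar{m} - 1|}{\bar{m}} \left( \frac{\pi}{2} - \sin^{-1}\left( \frac{\sqrt{\bar{m}} \left( \left(1 + \frac{1}{\bar{m}}\right) a - \left(1 - \frac{1}{\bar{m}}\right)^2 \right)}{2a} \right) \right) \right] = \tau. $$

This lemma is an extension of Theorem 6 in Appendix A with explicit evaluation of the integrals
appearing in that statement.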
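The quantities in Lemma 3 are tail integrals of the limiting eigenvalue density of $\frac{1}{m} HH^\dagger$. The sketch below is a numerical cross-check rather than the authors' derivation. It assumes the Marchenko-Pastur density $\bar{m}\sqrt{(\lambda - \lambda^-)(\lambda^+ - \lambda)}/(2\pi\lambda)$ on $[\lambda^-, \lambda^+]$ (the continuous part of the law quoted in the proof of Theorem 3), evaluates the tail mass and tail mean in closed form using standard antiderivatives, and compares them with a midpoint-rule integration. All function names are hypothetical.

```python
import math


def mp_edges(mbar):
    """Support edges (lambda^-, lambda^+) for the eigenvalues of (1/m) H H^dagger."""
    root = math.sqrt(1.0 / mbar)
    return (1.0 - root) ** 2, (1.0 + root) ** 2


def _asin(x):
    # Clamp to [-1, 1] to guard against rounding just outside the domain.
    return math.asin(max(-1.0, min(1.0, x)))


def tail_mass(a, mbar):
    """Closed form of the limiting fraction of eigenvalues above a."""
    lo, hi = mp_edges(mbar)
    centre = 1.0 + 1.0 / mbar
    root = math.sqrt(max(0.0, (hi - a) * (a - lo)))
    s1 = _asin(math.sqrt(mbar) * (centre - a) / 2.0)
    s2 = _asin(math.sqrt(mbar) * (centre * a - (1.0 - 1.0 / mbar) ** 2) / (2.0 * a))
    bracket = (-root + centre * (math.pi / 2 + s1)
               - abs(1.0 - 1.0 / mbar) * (math.pi / 2 - s2))
    return mbar / (2 * math.pi) * bracket


def tail_mean(a, mbar):
    """Closed form of the limiting normalized sum of the eigenvalues above a."""
    lo, hi = mp_edges(mbar)
    centre = 1.0 + 1.0 / mbar
    root = math.sqrt(max(0.0, (hi - a) * (a - lo)))
    s1 = _asin(math.sqrt(mbar) * (centre - a) / 2.0)
    return mbar / (2 * math.pi) * (0.5 * (centre - a) * root
                                   + (2.0 / mbar) * (math.pi / 2 + s1))


def solve_a(tau, mbar):
    """Find a with tail_mass(a) = tau by bisection (tail_mass decreases in a)."""
    left, right = mp_edges(mbar)
    for _ in range(200):
        mid = 0.5 * (left + right)
        if tail_mass(mid, mbar) > tau:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)


def numeric_tail(a, mbar, weight, steps=200000):
    """Midpoint-rule integral of weight(lam) * density(lam) over [a, lambda^+]."""
    lo, hi = mp_edges(mbar)
    h = (hi - a) / steps
    total = 0.0
    for k in range(steps):
        lam = a + (k + 0.5) * h
        dens = mbar * math.sqrt((lam - lo) * (hi - lam)) / (2 * math.pi * lam)
        total += weight(lam) * dens * h
    return total
```

With these helpers, `tail_mean(solve_a(tau, mbar), mbar)` evaluates the limit in Lemma 3 for a given fraction `tau` and aspect ratio `mbar`.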

Motivated by the observation that the expectation of a fixed fraction of the ordered eigenvalues converges 
to its limit quickly [10], we approximate d by (1 for fixed finite n and m. 



C. The Grassmann Manifold and the Composite Grassmann Manifold 

As demonstrated in [9], [10], the Grassmann manifold is closely related to eigen-channel vector
quantization, and here we introduce the composite Grassmann manifold. The results developed here help
quantify the effect of eigen-channel vector quantization in multiaccess systems (see Section IV-B for
details).

The Grassmann manifold $\mathcal{G}_{n,p}(\mathbb{L})$ is the set of all $p$-dimensional planes (through the origin) in the $n$-dimensional
Euclidean space $\mathbb{L}^n$, where $\mathbb{L}$ is either $\mathbb{R}$ or $\mathbb{C}$. A generator matrix $P \in \mathbb{L}^{n \times p}$ for a plane $\mathcal{P} \in
\mathcal{G}_{n,p}(\mathbb{L})$ is a matrix whose columns are orthonormal and span $\mathcal{P}$. For a given $\mathcal{P} \in \mathcal{G}_{n,p}(\mathbb{L})$, its generator
matrix is not unique: if $P$ generates $\mathcal{P}$ then $PU$ also generates $\mathcal{P}$ for any $p \times p$ orthogonal/unitary matrix $U$
(with respect to $\mathbb{L} = \mathbb{R}/\mathbb{C}$ respectively) [20]. The chordal distance between two planes $\mathcal{P}_1, \mathcal{P}_2 \in \mathcal{G}_{n,p}(\mathbb{L})$
can be defined by their generator matrices $P_1$ and $P_2$ via

$$ d_c(\mathcal{P}_1, \mathcal{P}_2) = \sqrt{p - \mathrm{tr}\left(P_1^\dagger P_2 P_2^\dagger P_1\right)}. $$
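For the case $p = 1$ used later for eigen-channel vectors, the chordal distance reduces to $\sqrt{1 - |P_1^\dagger P_2|^2}$. A minimal sketch of this special case follows; the function names are hypothetical, and the generators are assumed to be unit-norm complex vectors stored as Python lists.

```python
import cmath
import math


def normalize(v):
    """Scale a complex vector to unit Euclidean norm."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def chordal_distance_lines(p, q):
    """Chordal distance between the lines spanned by unit vectors p and q (p = 1 case)."""
    inner = sum(a.conjugate() * b for a, b in zip(p, q))
    # max(...) guards against a tiny negative value caused by rounding.
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))
```

Multiplying a generator by a unit-modulus scalar (the $1 \times 1$ unitary $U$) leaves the distance unchanged, which reflects the non-uniqueness of generator matrices noted above.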



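The isotropic measure $\mu$ on $\mathcal{G}_{n,p}(\mathbb{L})$ is the Haar measure on $\mathcal{G}_{n,p}(\mathbb{L})$ (see footnote 3). Let $O(n)$ and $U(n)$ be the sets
of $n \times n$ orthogonal and unitary matrices respectively. Let $A \in O(n)$ when $\mathbb{L} = \mathbb{R}$, or $A \in U(n)$ when
$\mathbb{L} = \mathbb{C}$. For any measurable set $\mathcal{M} \subset \mathcal{G}_{n,p}(\mathbb{L})$ and arbitrary $A$, $\mu$ satisfies

$$ \mu(A\mathcal{M}) = \mu(\mathcal{M}). $$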

Given the above definitions, the distortion rate tradeoff on the Grassmann manifold is quantified in [11],
[22]. A quantization $q$ is a mapping from $\mathcal{G}_{n,p}(\mathbb{L})$ to a discrete subset of $\mathcal{G}_{n,p}(\mathbb{L})$, which is typically
called a code $\mathcal{C}$. For the sake of application, the quantization

$$ q : \mathcal{G}_{n,p}(\mathbb{L}) \to \mathcal{C}, \qquad \mathcal{Q} \mapsto q(\mathcal{Q}) = \arg\min_{\mathcal{P} \in \mathcal{C}} d_c(\mathcal{P}, \mathcal{Q}) $$

is of particular interest. Define the distortion metric on $\mathcal{G}_{n,p}(\mathbb{L})$ as the squared chordal distance. Let
$\mathcal{Q} \in \mathcal{G}_{n,p}(\mathbb{L})$ be isotropically distributed (the probability measure is the isotropic measure). For a given
code $\mathcal{C}$, the distortion associated with this codebook is defined as

$$ D(\mathcal{C}) = E_{\mathcal{Q}}\left[ \min_{\mathcal{P} \in \mathcal{C}} d_c^2(\mathcal{P}, \mathcal{Q}) \right]. $$

For a given code size $K$, where $K$ is a positive integer, the distortion rate function is

$$ D^*(K) = \inf_{\mathcal{C} : |\mathcal{C}| = K} D(\mathcal{C}). $$

In [11], [22], we quantify the distortion rate function by constructing tight lower and upper bounds. The
results are summarized as follows.

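Lemma 4: Consider the distortion rate function on $\mathcal{G}_{n,p}(\mathbb{L})$. Let $t = \beta p (n - p)$,

$$ \beta = \begin{cases} 1 & \text{if } \mathbb{L} = \mathbb{R} \\ 2 & \text{if } \mathbb{L} = \mathbb{C} \end{cases} $$

and

$$ c_{n,p,p,\beta} = \begin{cases} \dfrac{1}{\Gamma\left(\frac{t}{2}+1\right)} \prod_{i=1}^{p} \dfrac{\Gamma\left(\frac{\beta}{2}(n-i+1)\right)}{\Gamma\left(\frac{\beta}{2}(p-i+1)\right)} & \text{if } p \le \frac{n}{2} \\[2ex] \dfrac{1}{\Gamma\left(\frac{t}{2}+1\right)} \prod_{i=1}^{n-p} \dfrac{\Gamma\left(\frac{\beta}{2}(n-i+1)\right)}{\Gamma\left(\frac{\beta}{2}(n-p-i+1)\right)} & \text{otherwise.} \end{cases} $$

When $K$ is sufficiently large ($c_{n,p,p,\beta}^{-\frac{2}{t}} 2^{-\frac{2 \log_2 K}{t}} \le 1$ necessarily),

$$ \frac{t}{t+2} \, c_{n,p,p,\beta}^{-\frac{2}{t}} \, 2^{-\frac{2 \log_2 K}{t}} (1 + o(1)) \le D^*(K) \le \frac{2}{t} \Gamma\left(\frac{2}{t}\right) c_{n,p,p,\beta}^{-\frac{2}{t}} \, 2^{-\frac{2 \log_2 K}{t}} (1 + o(1)). \tag{6} $$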

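3 The Haar measure is well defined for locally compact topological groups [19], [21], and therefore for the Grassmann manifold, the
composite Grassmann manifold and the composite Grassmann matrices. Here, the group right and left operations are clear from context.

To analyze the joint quantization problem arising in multiaccess MIMO systems (see Section IV-B
for details), we introduce the composite Grassmann manifold. The $m$-composite Grassmann manifold
$\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ is a Cartesian product of $m$ copies of $\mathcal{G}_{n,p}(\mathbb{L})$. Denote by $\mathcal{P}^{(m)}$ an element in $\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$. Then $\mathcal{P}^{(m)}$
can be written as $(\mathcal{P}_1, \cdots, \mathcal{P}_m)$ where $\mathcal{P}_i \in \mathcal{G}_{n,p}(\mathbb{L})$, $1 \le i \le m$. For any $\mathcal{P}_1^{(m)}, \mathcal{P}_2^{(m)} \in \mathcal{G}^{(m)}_{n,p}(\mathbb{L})$, the
chordal distance between them is well defined by

$$ d_c\left(\mathcal{P}_1^{(m)}, \mathcal{P}_2^{(m)}\right) = \sqrt{\sum_{j=1}^{m} d_c^2\left(\mathcal{P}_{1,j}, \mathcal{P}_{2,j}\right)}, $$

where $\mathcal{P}_1^{(m)} = (\mathcal{P}_{1,1}, \cdots, \mathcal{P}_{1,m})$, $\mathcal{P}_2^{(m)} = (\mathcal{P}_{2,1}, \cdots, \mathcal{P}_{2,m})$ and $\mathcal{P}_{i,j} \in \mathcal{G}_{n,p}(\mathbb{L})$ ($i = 1, 2$ and $j =
1, 2, \cdots, m$). The isotropic measure on $\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ can be induced from the isotropic measure on $\mathcal{G}_{n,p}(\mathbb{L})$:
it is just the product of the isotropic measures on the composed copies of $\mathcal{G}_{n,p}(\mathbb{L})$.

One goal will be to characterize the distortion rate function on $\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$. By analogy with the above
discussion, let a code $\mathcal{C}$ be any discrete subset of $\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$, and consider the quantization function

$$ q\left(\mathcal{Q}^{(m)}\right) = \arg\min_{\mathcal{P}^{(m)} \in \mathcal{C}} d_c\left(\mathcal{P}^{(m)}, \mathcal{Q}^{(m)}\right). \tag{7} $$

Let the distortion metric on $\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ be the squared chordal distance. The distortion associated with $\mathcal{C}$ is
given by

$$ D(\mathcal{C}) = E_{\mathcal{Q}^{(m)}}\left[ \min_{\mathcal{P}^{(m)} \in \mathcal{C}} d_c^2\left(\mathcal{P}^{(m)}, \mathcal{Q}^{(m)}\right) \right], $$

where $\mathcal{Q}^{(m)} \in \mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ is isotropically distributed. For all $K \in \mathbb{Z}^+$, the distortion rate function is defined
as

$$ D^*(K) = \inf_{\mathcal{C} \subset \mathcal{G}^{(m)}_{n,p}(\mathbb{L}) : |\mathcal{C}| = K} D(\mathcal{C}). $$

The following theorem characterizes $D^*(K)$ on $\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$.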



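Theorem 1: Consider the distortion rate function on $\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$. Let $t$, $c_{n,p,p,\beta}$ and $\beta$ be defined as in
Lemma 4. When $K$ is sufficiently large ($\frac{\Gamma^{\frac{2}{mt}}\left(\frac{mt}{2}+1\right)}{\Gamma^{\frac{2}{t}}\left(\frac{t}{2}+1\right)} c_{n,p,p,\beta}^{-\frac{2}{t}} 2^{-\frac{2 \log_2 K}{mt}} \le 1$ necessarily),

$$ \frac{mt}{mt+2} \left( \frac{\Gamma^{\frac{2}{mt}}\left(\frac{mt}{2}+1\right)}{\Gamma^{\frac{2}{t}}\left(\frac{t}{2}+1\right)} c_{n,p,p,\beta}^{-\frac{2}{t}} 2^{-\frac{2 \log_2 K}{mt}} \right) (1 + o(1)) \le D^*(K) \le \frac{2}{mt} \Gamma\left(\frac{2}{mt}\right) \left( \frac{\Gamma^{\frac{2}{mt}}\left(\frac{mt}{2}+1\right)}{\Gamma^{\frac{2}{t}}\left(\frac{t}{2}+1\right)} c_{n,p,p,\beta}^{-\frac{2}{t}} 2^{-\frac{2 \log_2 K}{mt}} \right) (1 + o(1)). \tag{8} $$

The detailed proof is given in Appendix C, but we mention here that the upper bound is derived
by calculating the average distortion of random codes, which turn out to be asymptotically optimal in
probability. Further, the lower and upper bounds differ only in the constant factors: $\frac{mt}{mt+2}$ for the lower
bound and $\frac{2}{mt} \Gamma\left(\frac{2}{mt}\right)$ for the upper bound. As $n, K \to \infty$ with $\frac{\log_2 K}{n} \to r$, this discrepancy vanishes and
we precisely characterize the asymptotic distortion rate function.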

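Theorem 2: Fix $p$ and $m$. Let $n, K \to \infty$ with $\frac{\log_2 K}{n} \to r$. If $r$ is sufficiently large ($m p \, 2^{-\frac{2}{\beta m p} r} \le 1$
necessarily), then

$$ \lim_{(n,K) \to \infty} D^*(K) = m p \, 2^{-\frac{2}{\beta m p} r}, $$

where $\beta = 1$ if $\mathbb{L} = \mathbb{R}$, and $\beta = 2$ if $\mathbb{L} = \mathbb{C}$. Furthermore, let $\mathcal{C}_{\mathrm{rand}} \subset \mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ be a code randomly
generated from the isotropic distribution and with size $K$. Then for all $\epsilon > 0$,

$$ \lim_{(n,K) \to \infty} \Pr\left( D(\mathcal{C}_{\mathrm{rand}}) > m p \, 2^{-\frac{2}{\beta m p} r} + \epsilon \right) = 0. $$

The proof of this theorem follows from those in [11, Theorem 3] and is omitted here.
Theorem 1 also provides an approximate formula for the distortion rate function at finite $n$ and $K$:

$$ D^*(K) \approx \frac{2}{mt} \Gamma\left(\frac{2}{mt}\right) \frac{\Gamma^{\frac{2}{mt}}\left(\frac{mt}{2}+1\right)}{\Gamma^{\frac{2}{t}}\left(\frac{t}{2}+1\right)} c_{n,p,p,\beta}^{-\frac{2}{t}} 2^{-\frac{2 \log_2 K}{mt}}. \tag{9} $$

By the asymptotic optimality of random codes, we employ random codes for our analysis, and
approximate the corresponding distortion rate function by ignoring the higher order terms in (8).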
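The bounds of this subsection involve ratios of very large Gamma functions, so in practice they are best evaluated in the log domain. The sketch below is one way to do this under our reading of (6), (8) and (9); the function names are hypothetical, and the code size enters through `log2_k` $= \log_2 K$ so that large codebooks do not overflow.

```python
import math


def log_c(n, p, beta):
    """Natural log of c_{n,p,p,beta} as read from Lemma 4 (beta = 1 real, 2 complex)."""
    t = beta * p * (n - p)
    q = p if 2 * p <= n else n - p  # both cases of the lemma share this form
    total = -math.lgamma(t / 2 + 1)
    for i in range(1, q + 1):
        total += math.lgamma(beta / 2 * (n - i + 1)) - math.lgamma(beta / 2 * (q - i + 1))
    return total


def distortion_bounds(log2_k, n, p, m=1, beta=2):
    """Leading terms of the lower and upper bounds on D*(K); m = 1 gives the Lemma 4 case."""
    t = beta * p * (n - p)
    mt = m * t
    log_common = ((2 / mt) * math.lgamma(mt / 2 + 1)
                  - (2 / t) * math.lgamma(t / 2 + 1)
                  - (2 / t) * log_c(n, p, beta)
                  - (2 * log2_k / mt) * math.log(2))
    common = math.exp(log_common)
    lower = mt / (mt + 2) * common
    upper = (2 / mt) * math.gamma(2 / mt) * common
    return lower, upper
```

The second returned value is the finite-size approximation (9). For large $n$ with $\log_2 K = r n$, both values approach $m p \, 2^{-\frac{2}{\beta m p} r}$, in line with Theorem 2.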

D. Calculations Related to Composite Grassmann Matrices 

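A composite Grassmann matrix $P^{(m)}$ is a generator matrix generating $\mathcal{P}^{(m)} \in \mathcal{G}^{(m)}_{n,p}(\mathbb{L})$, and we
denote the set of composite Grassmann matrices by $\mathcal{M}^{(m)}_{n,p}(\mathbb{L})$. A composite Grassmann matrix $P^{(m)} =
[P_1 \cdots P_m] \in \mathcal{M}^{(m)}_{n,p}(\mathbb{L})$ generates a plane $\mathcal{P}^{(m)} = (\mathcal{P}_1, \cdots, \mathcal{P}_m) \in \mathcal{G}^{(m)}_{n,p}(\mathbb{L})$, where $P_1, \cdots, P_m$ are the
generator matrices for $\mathcal{P}_1, \cdots, \mathcal{P}_m$ respectively. Note that the generator matrix $P_i$ for a plane $\mathcal{P}_i \in \mathcal{G}_{n,p}(\mathbb{L})$
is not unique. The composite Grassmann matrix $P^{(m)} \in \mathcal{M}^{(m)}_{n,p}(\mathbb{L})$ generating $\mathcal{P}^{(m)} \in \mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ is not
unique either: let $U^{(m)}$ be an arbitrary $mp \times mp$ block diagonal matrix whose $i$th ($1 \le i \le m$) diagonal
block is a $p \times p$ orthogonal/unitary matrix (w.r.t. $\mathbb{L} = \mathbb{R}/\mathbb{C}$ respectively); if $P^{(m)}$ generates $\mathcal{P}^{(m)}$, then
$P^{(m)} U^{(m)}$ generates $\mathcal{P}^{(m)}$ as well. View $\mathcal{M}^{(m)}_{n,p}(\mathbb{L})$ as a Cartesian product of $m$ copies of $\mathcal{M}^{(1)}_{n,p}(\mathbb{L})$. Then the
isotropic measure $\mu$ on $\mathcal{M}^{(m)}_{n,p}(\mathbb{L})$ is simply the product of the Haar measures on the composed copies of $\mathcal{M}^{(1)}_{n,p}(\mathbb{L})$.
We say that $P^{(m)} \in \mathcal{M}^{(m)}_{n,p}(\mathbb{L})$ is isotropically distributed if the corresponding probability measure is the
isotropic measure $\mu$.

Note now that we are interested in quantifying $E\left[\log\left|I + c P^{(m)} P^{(m)\dagger}\right|\right]$, for a constant $c \in \mathbb{R}^+$ and
isotropically distributed $P^{(m)} \in \mathcal{M}^{(m)}_{n,1}(\mathbb{C})$. The asymptotic behavior of this quantity is derived by random
matrix theory techniques.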

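Theorem 3: Let $P^{(m)} \in \mathcal{M}^{(m)}_{n,1}(\mathbb{C})$ be isotropically distributed. For all positive real numbers $c$, as
$n, m \to \infty$ with $\frac{m}{n} \to \bar{m} \in \mathbb{R}^+$,

$$ \lim_{(n,m) \to \infty} \frac{1}{n} E\left[\log\left|I + c P^{(m)} P^{(m)\dagger}\right|\right] = \log\left(1 + c\bar{m} - \frac{1}{4} F(c, \bar{m})\right) + \bar{m} \log\left(1 + c - \frac{1}{4} F(c, \bar{m})\right) - \frac{1}{4c} F(c, \bar{m}), \tag{10} $$

where

$$ F(z, \bar{m}) = \left( \left(1 + \bar{m} \lambda^- z\right)^{1/2} - \left(1 + \bar{m} \lambda^+ z\right)^{1/2} \right)^2, $$

$\lambda^+ = \left(1 + \sqrt{1/\bar{m}}\right)^2$ and $\lambda^- = \left(1 - \sqrt{1/\bar{m}}\right)^2$.

Proof: Let $H \in \mathbb{C}^{n \times m}$ be a standard Gaussian matrix. Let $P^{(m)} \in \mathcal{M}^{(m)}_{n,1}(\mathbb{C})$ be isotropically
distributed. As $n, m \to \infty$ with a positive ratio, the eigenvalue statistics of $P^{(m)} P^{(m)\dagger}$ and $\frac{1}{n} HH^\dagger$ are
asymptotically the same. Indeed, the Rayleigh-Ritz criterion shows that the discrepancy between corresponding
eigenvalues of these two matrices is bounded (multiplicatively) above and below by the minimum and
maximum column norms of $\frac{1}{n} HH^\dagger$, both of which converge to one almost surely. Thus,

$$ \lim_{(n,m) \to \infty} \frac{1}{n} E\left[\log\left|I + c P^{(m)} P^{(m)\dagger}\right|\right] = \lim_{(n,m) \to \infty} \frac{1}{n} E\left[\log\left|I + c \frac{m}{n} \frac{1}{m} HH^\dagger\right|\right]. $$

Now, it is a basic result in random matrix theory [23, Eq. (1.10)] (also see Appendix A) that the empirical
distribution of the eigenvalues of $\frac{1}{m} HH^\dagger$ converges to the Marchenko-Pastur law given by

$$ d\mu_\lambda = (1 - \bar{m})^+ \delta(\lambda) \, d\lambda + \frac{\bar{m} \sqrt{(\lambda - \lambda^-)^+ (\lambda^+ - \lambda)^+}}{2\pi\lambda} \, d\lambda $$

almost surely, where $(z)^+ = \max(0, z)$. Thus,

$$ \lim_{(n,m) \to \infty} \frac{1}{n} E\left[\log\left|I + c P^{(m)} P^{(m)\dagger}\right|\right] = \int \log(1 + c\bar{m}\lambda) \, d\mu_\lambda $$

since $\log(1 + c\bar{m}\lambda)$ is a bounded continuous function on the spectral support. The resulting integral is
evaluated in [24], and the proof is finished.

■

For finite $n$ and $m$, we substitute $\bar{m} = \frac{m}{n}$ into (10) to approximate $\frac{1}{n} E\left[\log\left|I + c P^{(m)} P^{(m)\dagger}\right|\right]$.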
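The closed form in Theorem 3 can be checked against a direct numerical integration of $\log(1 + c\bar{m}\lambda)$ over the Marchenko-Pastur density. The sketch below assumes the standard closed-form evaluation of this integral, written with $F(z, \bar{m}) = \left(\sqrt{1 + z(1+\sqrt{\bar{m}})^2} - \sqrt{1 + z(1-\sqrt{\bar{m}})^2}\right)^2$ and natural logarithms; this is our reading of (10), not a verbatim transcription. Function names are hypothetical.

```python
import math


def F(z, mbar):
    """Auxiliary function in the assumed closed form of the log-determinant limit."""
    plus = (1.0 + math.sqrt(mbar)) ** 2
    minus = (1.0 - math.sqrt(mbar)) ** 2
    return (math.sqrt(1.0 + plus * z) - math.sqrt(1.0 + minus * z)) ** 2


def rate_limit(c, mbar):
    """Assumed closed form for lim (1/n) E[log|I + c P P^dagger|], in nats."""
    f = F(c, mbar)
    return (math.log(1.0 + c * mbar - f / 4.0)
            + mbar * math.log(1.0 + c - f / 4.0)
            - f / (4.0 * c))


def rate_numeric(c, mbar, steps=200000):
    """Midpoint-rule integral of log(1 + c*mbar*lam) over the Marchenko-Pastur density.

    The point mass at zero (present when mbar < 1) contributes log(1) = 0.
    """
    lo = (1.0 - math.sqrt(1.0 / mbar)) ** 2
    hi = (1.0 + math.sqrt(1.0 / mbar)) ** 2
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        lam = lo + (k + 0.5) * h
        dens = mbar * math.sqrt((lam - lo) * (hi - lam)) / (2.0 * math.pi * lam)
        total += math.log(1.0 + c * mbar * lam) * dens * h
    return total
```

For a finite system, `n * rate_limit(c, m / n)` then gives the approximation described in the sentence above.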

IV. Suboptimal Strategies and the Sum Rate 

Given finite rate feedback, the optimal strategy (1) involves two coupled optimization problems: one
is with respect to the feedback function $\varphi$ and the other is over all possible covariance
matrix codebooks. The corresponding design and analysis are extremely complicated, and instead we
study suboptimal power on/off strategies. Motivated by the near optimal power on/off strategy for single
user MIMO systems [9], [10], we assume:

T1) Power on/off strategy: The $i$th user's covariance matrix is of the form $\Sigma_i = P_{\mathrm{on}} Q_i Q_i^\dagger$, where $P_{\mathrm{on}}$ is
a fixed positive constant denoting the on-power and $Q_i$ is the beamforming matrix for user $i$. Call
each column of $Q_i$ an on-beam and denote the number of columns of $Q_i$ by $s_i$ ($0 \le s_i \le L_T$); then
$Q_i^\dagger Q_i = I_{s_i}$. Here, $s_i$ is the number of data streams (or on-beams) for user $i$ ($s_i = 0$ implies
that user $i$ is off). A user $i$ with $s_i > 0$ is referred to as an on-user.
T2) Constant number of on-beams: Let $s = \sum_{i=1}^{N} s_i$, the total number of on-beams, be constant
independent of the specific channel realization for a given SNR. With this assumption, $P_{\mathrm{on}} = \rho/s$.
Remark 1: Using a constant number of on-beams is motivated by the fact that it is asymptotically
optimal to turn on a constant fraction of all eigen-channels as $L_T, L_R \to \infty$ with a positive ratio; see
[10], which also demonstrates the good performance of this strategy. While the number of on-beams is
independent of channel realizations, it is a function of the SNR. Note, though, that the SNR typically changes
on a much larger time scale than block fading. Keeping the number of on-beams constant enables the base
station to keep the same feedback and decoding processing from one fading block to another, and therefore
reduces the complexity of real-world systems.

These two assumptions essentially add extra structure to the input covariance matrix $\Sigma$. Given this
structure, we propose a joint quantization and feedback strategy in Section IV-B, which we term "general
beamforming strategy". As we shall see in Section IV-C, antenna selection can be viewed as a special case
of general beamforming. Due to the simplicity of antenna selection, we first discuss its main features.

A. Antenna Selection 

The antenna selection strategy is described as follows. Index all $NL_T$ antennas by $i$ ($i = 1, \cdots, NL_T$).
Then

$$ Y = \sum_{i=1}^{NL_T} h_i X_i + W, $$

where $h_i$ is the $i$th column of the overall channel state matrix $H$ (defined in Section II), and $X_i$ is the
Gaussian data source corresponding to antenna $i$. The power on/off assumptions T1) and T2) imply that
either $E[|X_i|^2] = \frac{\rho}{s}$ or $E[|X_i|^2] = 0$. Indeed, for a specific user, its input covariance matrix can be written as
$\frac{\rho}{s} Q Q^\dagger$, where $Q$ is obtained by extracting some columns from the identity matrix. Given a channel
realization $H$, the base station selects $s$ antennas according to

F1) Antenna selection criterion. Sort the channel state vectors $h_i$'s increasingly according to their
Frobenius norms such that $\|h_{(1:NL_T)}\| \le \|h_{(2:NL_T)}\| \le \cdots \le \|h_{(NL_T:NL_T)}\|$, where $\|\cdot\|$ denotes
the Frobenius norm. Then the antennas corresponding to $h_{(NL_T-s+1:NL_T)}, \cdots, h_{(NL_T:NL_T)}$ are
selected to be turned on.

To feed back the antenna selection information, $\log_2 \binom{NL_T}{s}$ bits are needed in total. The corresponding
signal model then reduces to



$$ Y = \sum_{k=1}^{s} h_{(NL_T-k+1:NL_T)} X_k + W. $$

Let $h_{(NL_T-k+1:NL_T)} = n_k \xi_k$, where $n_k = \|h_{(NL_T-k+1:NL_T)}\|$ and $\xi_k = h_{(NL_T-k+1:NL_T)} / n_k$. Define
$\Xi := [\xi_1 \cdots \xi_s]$. Then the sum rate $\mathcal{I}$ is upper bounded by

$$ \mathcal{I} := E_H\left[\log\left|I_{L_R} + \frac{\rho}{s} \Xi \, \mathrm{diag}\left[n_1^2, \cdots, n_s^2\right] \Xi^\dagger\right|\right] \le E_H\left[\log\left|I_{L_R} + \eta \frac{\rho}{s} L_R \Xi \Xi^\dagger\right|\right], \tag{11} $$

where

$$ \eta := \frac{1}{s L_R} E\left[\sum_{k=1}^{s} n_k^2\right], \tag{12} $$

and the inequality comes from the concavity of the $\log|\cdot|$ function [25] and the fact that $\Xi$ and $n^2 := \left[n_1^2, \cdots, n_s^2\right]$
are independent [26, Eq. (3.9)]. We refer to $\eta$ as the power efficiency factor as it describes the
power gain of choosing the strongest antennas against random antenna selection: if antennas are selected
randomly with the total power constraint increased to $\rho\eta$, the average received signal power is the same
as that of our antenna selection strategy.

Based on the upper bound (11), the sum rate can be approximately quantified. Note that the $\|h_i\|^2$'s are
i.i.d. random variables with PDF $f(x) = \frac{1}{(L_R-1)!} x^{L_R-1} e^{-x}$. An application of (3) provides an accurate approximation of
$\eta$. Furthermore, note that $\Xi \in \mathcal{M}^{(s)}_{L_R,1}(\mathbb{C})$ is isotropically distributed. Substituting $c = \frac{\rho}{s} \eta L_R$ and $\bar{m} = \frac{s}{L_R}$
into (10) estimates the upper bound (11). Simulations in Section V show that this theoretical calculation
gives a good approximation to the true sum rate.
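The antenna selection rule F1) and the power efficiency factor can be illustrated with a short simulation. The sketch below is not the authors' simulation code: it assumes $\eta$ is the average squared norm of the selected columns normalized by $L_R$ (the mean squared norm of a randomly chosen column), and it draws the squared column norms directly as Gamma$(L_R, 1)$ variables, as implied by the PDF above. All function names are hypothetical.

```python
import math
import random


def select_antennas(col_norms_sq, s):
    """Return the indices of the s columns with the largest squared norms (rule F1)."""
    order = sorted(range(len(col_norms_sq)), key=lambda i: col_norms_sq[i])
    return order[-s:]


def feedback_bits(total_antennas, s):
    """Number of bits needed to describe which s of the antennas are on."""
    return math.log2(math.comb(total_antennas, s))


def estimate_eta(N, L_T, L_R, s, trials=2000, seed=0):
    """Monte Carlo estimate of the power efficiency factor under the stated assumption."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        # Squared norm of an L_R-vector with CN(0,1) entries is Gamma(L_R, 1).
        norms = [rng.gammavariate(L_R, 1.0) for _ in range(N * L_T)]
        chosen = select_antennas(norms, s)
        acc += sum(norms[i] for i in chosen)
    return acc / (trials * s * L_R)
```

With random selection the same estimator would return a value close to one, so `estimate_eta(...)` exceeding one quantifies the gain from picking the strongest antennas.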



B. General Beamforming Strategy 

In this subsection, we propose a power on/off strategy with general beamforming: the base station
selects the strongest users, jointly quantizes their strongest eigen-channel vectors and broadcasts a common
feedback to all the users; then the on-users transmit along the fed-back beamforming vectors.

Remark 2: We consider this suboptimal strategy for its simplicity of implementation and tractable
performance analysis. The user selection is based only on the Frobenius norm of the channel realization,
which does not require complicated matrix computations. Note that only a few users are chosen among
a large number of total users available and that a singular value decomposition is performed only after
user selection in our strategy. The computational complexity is much lower than that of a user selection strategy
depending on the eigenvalues of the channel matrices. For each selected user, only the strongest eigen-channel
is used. This assumption imposes a nice symmetric structure and makes the analysis tractable.

In particular, for transmission, along with assumptions Tl) and T2), we add one more constraint: 

T3) There is at most one on-beam per user, that is, Sj = or = 1. Note that this also implies that 
the total number of on-streams s is the same as the number of on-users. 
For a given channel realization H, we select the on-users according to 

F2) User selection criterion. Sort the channel state matrices H/s such that ||H( 1:A f)|| < ||H( 2 :ao|| — 

■ • • < ||H(jv:JV) ||> where ||-|| is the Frobenius norm. Then the users corresponding to H(N-k+i:N), 

■ ■ ■ , H( A r. iV ) are selected to be turned on. 

After selecting the on-users, the base stations also quantizes their strongest eigen-channel vectors. Con- 
sider the singular value decomposition Hnv-fc+i : jv) — UfoA^-V], where the diagonal elements of A fc are 
decreasingly ordered. Let v fe be the column of V fe corresponding to the largest singular value of A fc . Then 
the matrix 

V:= [vr-vJeM^ (C), 
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where M.^} 1 (C) is the set of composite Grassmann matrices (defined in Section IIII-D|) . In order to 



quantize V, the base station constructs a codebook B C Ai^l 1 (C) with \B\ 
feedback bits available for eigen-channel vector quantization. Note that random codebooks are asymp- 
totically optimal in probability (Theorem^), we assume that B is randomly generated from the isotropic 
distribution. For a given eigen-channel vector matrix V, the base station quantizes V via the 
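F3) Eigen-channel vector quantization function

$$\varphi(\mathbf{V}) = \arg\max_{\mathbf{B} \in \mathcal{B}} \sum_{k=1}^{s} \left|\mathbf{v}_k^\dagger \mathbf{b}_k\right|^2, \tag{13}$$

where $\mathbf{b}_k$ is the $k$th column of $\mathbf{B} \in \mathcal{B}$. Indeed, let $P^{(s)}, Q^{(s)} \in \mathcal{G}^{(s)}_{L_T,1}(\mathbb{C})$ be the composite planes generated by $\mathbf{V}$ and $\mathbf{B}$ respectively. Then (13) is equivalent to the quantization function on the composite Grassmann manifold defined earlier.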
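The quantization rule (13) with a random codebook can be sketched as follows. The helper names `random_unit_columns` and `joint_quantize`, the array layout and the parameter values are assumptions made for this illustration; the paper does not prescribe an implementation.

```python
import numpy as np

def random_unit_columns(rng, L_T, s):
    """Draw an L_T x s complex matrix whose columns are isotropic unit vectors."""
    X = rng.standard_normal((L_T, s)) + 1j * rng.standard_normal((L_T, s))
    return X / np.linalg.norm(X, axis=0, keepdims=True)

def joint_quantize(V, codebook):
    """Return (index, codeword, score) maximizing sum_k |v_k^dagger b_k|^2.

    V has shape (L_T, s); codebook has shape (K, L_T, s).
    """
    # inner[i, k] = v_k^dagger b_k for the i-th codeword
    inner = np.einsum('lk,ilk->ik', V.conj(), codebook)
    scores = np.sum(np.abs(inner) ** 2, axis=1)
    idx = int(np.argmax(scores))
    return idx, codebook[idx], float(scores[idx])

rng = np.random.default_rng(1)
L_T, s, R_q = 2, 4, 8                        # illustrative sizes
codebook = np.stack([random_unit_columns(rng, L_T, s) for _ in range(2 ** R_q)])
V = random_unit_columns(rng, L_T, s)
idx, B, score = joint_quantize(V, codebook)
```

The score depends on each $\mathbf{v}_k$ only through $|\mathbf{v}_k^\dagger \mathbf{b}_k|$, so rotating the phase of any column of $\mathbf{V}$ leaves the selected index unchanged, which is what one expects of a rule defined on planes rather than on vectors.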
After quantization, the base station broadcasts the user selection information (requiring $\log_2 \binom{N}{s}$ feedback bits) and the index of eigen-channel vector quantization to the users. The corresponding signal model is now reduced to

$$\mathbf{Y} = \sum_{k=1}^{s} \mathbf{H}_{(N-k+1:N)} \mathbf{b}_k X_k + \mathbf{W} = \sum_{k=1}^{s} \mathbf{h}_k X_k + \mathbf{W},$$

where $\mathbf{h}_k := \mathbf{H}_{(N-k+1:N)} \mathbf{b}_k$ is the equivalent channel for the on-user $k$.



The point is that the joint quantization (13) efficiently employs the feedback resource. It differs from an individual quantization where each $\mathbf{v}_k$ is quantized independently: separate codebooks $\mathcal{B}_1, \cdots, \mathcal{B}_s$ are constructed for quantization of $\mathbf{v}_1, \cdots, \mathbf{v}_s$ respectively, and the quantization function is

$$\prod_{k=1}^{s} \arg\max_{\mathbf{b} \in \mathcal{B}_k} \left|\mathbf{v}_k^\dagger \mathbf{b}\right|,$$

where $\prod$ is the Cartesian product. Indeed, individual quantization is a special case of joint quantization obtained by restricting the codebook to be a Cartesian product of several individual codebooks. It is thus clear that joint quantization achieves a gain analogous to that of vector quantization over scalar quantization.
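The comparison can be explored numerically with a small Monte Carlo sketch. It estimates the average of $\sum_k |\mathbf{v}_k^\dagger \mathbf{b}_k|^2$ under joint and individual random codebooks that use the same total number of bits. The function names, the even split of bits across users and the tiny parameter values ($L_T = 2$, $s = 2$, 4 bits) are assumptions of this illustration, not settings from the paper.

```python
import numpy as np

def unit_vectors(rng, shape):
    """Isotropic complex unit vectors along axis -2 (the antenna axis)."""
    X = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return X / np.linalg.norm(X, axis=-2, keepdims=True)

def gamma_joint(rng, L_T, s, bits, trials):
    """Average of max over a joint codebook of sum_k |v_k^dagger b_k|^2."""
    total = 0.0
    for _ in range(trials):
        V = unit_vectors(rng, (L_T, s))
        C = unit_vectors(rng, (2 ** bits, L_T, s))        # fresh joint codebook
        inner = np.abs(np.einsum('lk,ilk->ik', V.conj(), C)) ** 2
        total += inner.sum(axis=1).max()
    return total / trials

def gamma_individual(rng, L_T, s, bits, trials):
    """Same quantity when each user has its own codebook of bits // s bits."""
    per_user = bits // s
    total = 0.0
    for _ in range(trials):
        V = unit_vectors(rng, (L_T, s))
        C = unit_vectors(rng, (2 ** per_user, L_T, s))    # C[:, :, k] is codebook B_k
        inner = np.abs(np.einsum('lk,ilk->ik', V.conj(), C)) ** 2
        total += inner.max(axis=0).sum()                  # best codeword per user
    return total / trials

rng = np.random.default_rng(3)
g_joint = gamma_joint(rng, 2, 2, 4, 2000)
g_ind = gamma_individual(rng, 2, 2, 4, 2000)
```

In this small experiment the joint estimate comes out larger than the individual one, in line with the vector-versus-scalar intuition; the sketch says nothing about how large the gap is at the paper's operating points.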

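Certainly the sum rate depends on the codebook. Still, when random codebooks are considered, it is reasonable to focus upon the ensemble average sum rate. Let $\mathbf{h}_k = n_k \boldsymbol{\xi}_k$ and $\boldsymbol{\Xi} = [\boldsymbol{\xi}_1 \cdots \boldsymbol{\xi}_s]$, where $n_k = \|\mathbf{h}_k\|$ and $\boldsymbol{\xi}_k = \mathbf{h}_k / n_k$. Then the average sum rate satisfies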



Z 



rand 



E, 



< E H 



log 
log 



Il b + -Sdiag [n{, 



s 



(14) 



where $\eta$ is evaluated below. The inequality in the second line follows from Jensen's inequality and the next fact.

Theorem 4: $\boldsymbol{\xi}_k$'s, $1 \le k \le s$, are independent and isotropically distributed. Furthermore, $\boldsymbol{\xi}_k$'s are independent of $n_k$'s.

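Proof: Consider the singular value decomposition of a standard Gaussian matrix $\mathbf{H} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{V}^\dagger$. It is well known that $\mathbf{U}$ and $\mathbf{V}$ are independent and isotropically distributed, and both of them are independent of $\boldsymbol{\Lambda}$ [26, Eq. (3.9)]. Now let $\mathbf{U}_k \boldsymbol{\Lambda}_k \mathbf{V}_k^\dagger$ be the singular value decomposition of $\mathbf{H}_{(N-k+1:N)}$, $1 \le k \le s$. Since we choose users only according to their Frobenius norms, the choice of $\mathbf{H}_{(N-k+1:N)}$ depends only on $\boldsymbol{\Lambda}_k$ and is independent of $\mathbf{U}_k$ and $\mathbf{V}_k$. The independence among $\mathbf{U}_k$, $\mathbf{V}_k$ and $\boldsymbol{\Lambda}_k$ still holds. Note that the equivalent channel vector is $\mathbf{h}_k = \mathbf{U}_k \boldsymbol{\Lambda}_k \mathbf{V}_k^\dagger \mathbf{b}_k = \mathbf{U}_k \tilde{\boldsymbol{\xi}}_k n_k$, where $\boldsymbol{\Lambda}_k \mathbf{V}_k^\dagger \mathbf{b}_k = \tilde{\boldsymbol{\xi}}_k n_k$. Since $\tilde{\boldsymbol{\xi}}_k$ depends only on $\boldsymbol{\Lambda}_k$, $\mathbf{V}_k$ and $\mathbf{b}_k$, none of which involves $\mathbf{U}_k$, $\mathbf{U}_k$ is independent of $\tilde{\boldsymbol{\xi}}_k$. Thus $\boldsymbol{\xi}_k = \mathbf{U}_k \tilde{\boldsymbol{\xi}}_k$ is isotropically distributed [27]. Now the fact that $\mathbf{U}_k$'s are independent across $k$ implies that $\boldsymbol{\xi}_k$'s are independent across $k$ [27].

Next realize that $n_k$ is only a function of $\boldsymbol{\Lambda}_k$ and $\mathbf{V}_k$, both of which are independent of $\mathbf{U}_k$. $\mathbf{U}_k$'s are independent of $n_k$'s (and isotropically distributed). It follows that $\boldsymbol{\xi}_k$'s and $n_k$'s are independent [27]. ■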



The calculation of Eg [rj] proceeds as follows. To simplify notation, let H(&) = H 



(N-k+l:N) 



and nf, 



|H (fc )|| . Let nj 



s Sfc=i E • Let \ k j (1 < k < s and 1 < j < Lr) be the decreasingly ordered 



eigenvalues of H^Hj^, and Q = E 



A 



1 



B, let v fc be the k ttl column of V, and b k be the k x 



, defined in Lemma [2] For a quantization codebook 
1 column of B = ip (V) e B. Define 



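$$\gamma := \mathrm{E}_{\mathcal{B},\mathbf{V}}\left[\sum_{k=1}^{s} \left|\mathbf{v}_k^\dagger \mathbf{b}_k\right|^2\right]. \tag{15}$$

$\mathrm{E}_{\mathcal{B}}[\eta]$ is a function of $\gamma$.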

Theorem 5: Let the random codebook $\mathcal{B}$ follow the isotropic distribution. Then



Es [77] = E B>V 



E 

,fc=i 



■iR 



7 -( 1 + - 

S 



7 1-C1 



«(.). 



(16) 



The proof is contained in Appendix D.
To make use of this formula, the constant d can be well approximated by (i/l r using our results in 
Section Ull-Bl and n 2 ^ can be estimated by ([3]). Let R q be the quantization rate on eigen-channel vector 
quantization. As a function of $R_q$, an approximation of $\gamma$ is provided at the end of Section III-C. Putting these together gives our estimate of $\mathrm{E}_{\mathcal{B}}[\eta]$. To estimate the average sum rate, we only need to substitute the value of $\mathrm{E}_{\mathcal{B}}[\eta]$ into the bound (14) and then evaluate it via (10).



C. Comments 

1) Choice of $s$: The number of on-beams $s$ should be chosen to maximize the sum rate, keeping in mind that it is a function of the SNR $\rho$. Given that our proved bound accurately approximates the sum rate (when $s \ll N$ and $R_q$ is large enough), the optimal number of on-beams $s^*$ can be found by a simple search.

2) Antenna Selection and General Beamforming: Antenna selection can be viewed as a special case of general beamforming where a beamforming vector has a particular structure: it must be a column of the identity matrix. Note that general beamforming requires a total feedback rate of $\log_2 \binom{N}{s} + R_q$ bits while antenna selection needs $\log_2 \binom{N L_T}{s} = \log_2 \binom{N}{s} + s \log_2 L_T + O\left(\frac{1}{N}\right)$ bits for feedback. Antenna selection can be viewed as general beamforming with $R_q = s \log_2 L_T$. One difference between antenna selection and general beamforming is that antenna selection does not assume one on-beam per on-user (Assumption T3)). In antenna selection, multiple antennas corresponding to the same user can be turned on simultaneously. As a result, the sum rate achieved by antenna selection is expected to be better than that of general beamforming with $R_q = s \log_2 L_T$. This is supported by our simulations.
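The feedback budgets in the comparison above can be checked numerically. The following sketch evaluates both expressions for one illustrative parameter set; the function names and the values $N = 256$, $L_T = 2$, $s = 4$ are assumptions chosen for the example.

```python
from math import comb, log2

def beamforming_feedback_bits(N, s, R_q):
    """User selection bits plus quantization bits for general beamforming."""
    return log2(comb(N, s)) + R_q

def antenna_selection_feedback_bits(N, L_T, s):
    """Bits needed to pick s antennas out of N * L_T."""
    return log2(comb(N * L_T, s))

N, L_T, s = 256, 2, 4                      # illustrative values
R_q = s * log2(L_T)                        # matching quantization rate
bf_bits = beamforming_feedback_bits(N, s, R_q)
as_bits = antenna_selection_feedback_bits(N, L_T, s)
gap = as_bits - bf_bits                    # the vanishing remainder term
```

For these values the gap is a small positive fraction of a bit, consistent with the remainder term vanishing as $N$ grows with $s$ fixed.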



V. Simulations and Discussion 

Simulations for antenna selection and general beamforming strategies are presented in Figs. 1 and 2, respectively. Fig. 1 shows the sum rate of antenna selection versus SNR. The circles are simulated sum rates, the solid lines are simulated upper bounds (11), the plus markers are the sum rates calculated by theoretical approximation, and the dotted lines are the sum rates corresponding to the case where there is no CSIT at all. In the simulations, the value of $s$ is chosen to maximize the sum rate according to our theoretical analysis. Fig. 2 illustrates how the sum rate increases as the eigen-channel vector quantization rate $R_q$ increases. Here, $s$ is fixed to be 4. The dash-dot lines denote perfect beamforming, which corresponds to $R_q = +\infty$. The circles are for our proposed joint strategy, the solid lines are simulated upper bounds (11), the up-triangles are for antenna selection and the down-triangles are for individual eigen-channel vector quantization (recall the detailed discussion in Section IV-B). We observe the following.

• The upper bounds (11) and (14) appear to be good approximations to the sum rate.

• The sum rate increases as the number of users $N$ increases. The simulations compare the $N = 32$ and $N = 256$ cases. Our analysis bears out that increasing $N$ results in an increase in the equivalent channel norms according to extreme order statistics. The power efficiency factor increases and therefore the sum rate performance improves.

• The loss due to eigen-channel vector quantization decreases exponentially as $R_q$ increases. According to the distortion rate analysis in Section III, the decay rate is $2^{-\frac{R_q}{s(L_T-1)}}$. When $L_T$ is not large (which is often true in practice), a relatively small $R_q$ may be good enough. In Fig. 2, as $L_T = 2$ and $s = 4$, $R_q = 12$ bits is almost as good as perfect beamforming.

• Our proposed joint strategy achieves better performance than individual quantization. Note that the effect of eigen-channel vector quantization is characterized by a single parameter $\gamma$. Joint quantization yields a larger $\gamma$, a larger power efficiency factor, and therefore better performance.

• Antenna selection is only slightly better than general beamforming with $R_q = s \log_2 L_T$. As has been discussed in Section IV-C, the performance improvement is due to excluding the assumption T3).



Fig. 1. Antenna Selection: Sum Rate versus SNR (horizontal axis: SNR in dB).

Fig. 2. General Beamforming ($s = 4$): Sum Rate versus $R_q$ (horizontal axis: feedback rate on quantization $R_q$ in bits).



VI. Conclusion 

This paper proposes a joint quantization and feedback strategy for multiaccess MIMO systems with 
finite rate feedback. The effect of user choice is analyzed by extreme order statistics and the effect 
of eigen-channel vector quantization is quantified by analysis on the composite Grassmann manifold. By 
asymptotic random matrix theory, the sum rate is well approximated. Due to its simple implementation and 
solid performance analysis, the proposed scheme provides a benchmark for multiaccess MIMO systems 
with finite rate feedback.

Appendix

A. Random Matrix Theory 

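Let $\mathbf{H} \in \mathbb{L}^{n \times m}$ be a standard Gaussian random matrix, where $\mathbb{L}$ is either $\mathbb{R}$ or $\mathbb{C}$. Let $\lambda_1, \cdots, \lambda_n$ be the $n$ singular values of $\frac{1}{m}\mathbf{H}\mathbf{H}^\dagger$. Define the empirical distribution of the singular values

$$\mu_{n,\lambda}(\lambda) = \frac{1}{n}\left|\{j : \lambda_j \le \lambda\}\right|.$$

As $n, m \to \infty$ with $\frac{m}{n} \to \bar{m} \in \mathbb{R}^+$, the empirical measure converges to the Marcenko-Pastur law

$$d\mu_\lambda = (1 - \bar{m})^+ \delta(\lambda) + \frac{\bar{m}\sqrt{(\lambda - \lambda^-)^+ (\lambda^+ - \lambda)^+}}{2\pi\lambda}\, d\lambda \tag{17}$$

almost surely, where $\lambda^\pm = \left(1 \pm \sqrt{1/\bar{m}}\right)^2$ and $(x)^+ = \max(x, 0)$. (A good reference for this type of result is [23, Eq. (1.10)].) Define

$$\lambda_t^- = \begin{cases} 0 & \text{if } \beta > 1 \\ \lambda^- & \text{if } \beta < 1 \end{cases}.$$

Consider as well a linear spectral statistic

$$g\left(\frac{1}{m}\mathbf{H}\mathbf{H}^\dagger\right) = \frac{1}{n}\sum_{i=1}^{n} g(\lambda_i).$$

If $g$ is Lipschitz on $[\lambda_t^-, \lambda^+]$, then we also have that

$$\lim_{(n,m)\to\infty} g\left(\frac{1}{m}\mathbf{H}\mathbf{H}^\dagger\right) = \int g(\lambda)\, d\mu_\lambda$$

almost surely; see for example [28] for a modern approach.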
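The convergence statements can be sanity-checked numerically. The sketch below compares the empirical mean of the eigenvalues of $\frac{1}{m}\mathbf{H}\mathbf{H}^\dagger$ with the integral of $\lambda$ against the continuous part of (17), and compares the largest eigenvalue with $\lambda^+$. The matrix sizes, the seed and the helper names are assumptions of this illustration, and the case $\bar{m} \ge 1$ is used so that there is no point mass at zero.

```python
import numpy as np

def mp_density(lam, m_bar):
    """Continuous part of the Marcenko-Pastur density in (17)."""
    lo = (1 - np.sqrt(1 / m_bar)) ** 2
    hi = (1 + np.sqrt(1 / m_bar)) ** 2
    inside = np.clip(lam - lo, 0, None) * np.clip(hi - lam, 0, None)
    return m_bar * np.sqrt(inside) / (2 * np.pi * lam)

def trapezoid(y, dx):
    """Plain trapezoidal rule on a uniform grid."""
    return float(np.sum((y[1:] + y[:-1]) / 2) * dx)

rng = np.random.default_rng(2)
n, m = 300, 600                               # illustrative sizes, m_bar = 2
m_bar = m / n
H = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
eigs = np.linalg.eigvalsh(H @ H.conj().T / m)

lam_minus = (1 - np.sqrt(1 / m_bar)) ** 2
lam_plus = (1 + np.sqrt(1 / m_bar)) ** 2
grid = np.linspace(lam_minus, lam_plus, 200001)
dens = mp_density(grid, m_bar)
dx = grid[1] - grid[0]

total_mass = trapezoid(dens, dx)              # should be close to 1
mean_limit = trapezoid(grid * dens, dx)       # limit of the statistic g(x) = x
mean_empirical = float(eigs.mean())
```

With $g(\lambda) = \lambda$ the linear spectral statistic is simply the average eigenvalue, which makes the agreement between `mean_empirical` and `mean_limit` easy to inspect.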

The asymptotic properties of the maximum eigenvalue will figure into our analysis. Denote the largest eigenvalue by $\lambda_1$.

Proposition 1: Let $n, m \to \infty$ linearly with $\frac{m}{n} \to \bar{m} \in \mathbb{R}^+$.

1) $\lambda_1 \to \lambda^+$ almost surely.

2) All moments of $\lambda_1$ also converge.

The almost sure convergence goes back to [29], [30]. The convergence of moments is implied by the tail estimates in [31]. A direct application of this proposition is that for $\forall A_n \subset \mathbb{R}^n$ such that $\mu_{n,\lambda}(A_n) \to 0$, $\mathrm{E}_\lambda[\lambda_1, A_n] \to 0$.

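Theorem 6: Let $\mathbf{H} \in \mathbb{L}^{n \times m}$ ($\mathbb{L} = \mathbb{R}/\mathbb{C}$) be a standard Gaussian matrix and $\lambda_i$ be the $i$th largest eigenvalue of $\frac{1}{m}\mathbf{H}\mathbf{H}^\dagger$.

1) Let $g(\lambda) = f(\lambda) \cdot \chi_{[a,\lambda^+]}(\lambda)$ for some $a < \lambda^+$, where $f(\lambda)$ is Lipschitz continuous on $[\lambda_t^-, \lambda^+]$ and $\chi_{[a,\lambda^+]}(\lambda)$ is the indicator function on the set $[a, \lambda^+]$. Then as $n, m \to \infty$ with $\frac{m}{n} \to \bar{m} \in \mathbb{R}^+$,

$$\lim_{(n,m)\to\infty} \int g(\lambda) \cdot d\mu_{n,\lambda}(\lambda) = \int g(\lambda) \cdot d\mu_\lambda$$

almost surely and

$$\lim_{(n,m)\to\infty} \mathrm{E}_\lambda\left[\frac{1}{n}\sum_{i=1}^{n} g(\lambda_i)\right] = \int g(\lambda) \cdot d\mu_\lambda.$$

2) For $\forall a \in (\lambda_t^-, \lambda^+)$,

$$\lim_{(n,m)\to\infty} \frac{1}{n}\left|\{\lambda_i : \lambda_i \ge a\}\right| = \int_a^{\lambda^+} d\mu_\lambda.$$

3) For $\forall \tau \in (0, \min(1, \bar{m}))$,

$$\lim_{(n,m)\to\infty} \mathrm{E}\left[\frac{1}{n}\sum_{1 \le i \le n\tau} \lambda_i\right] = \int_a^{\lambda^+} \lambda \cdot d\mu_\lambda,$$

where $a \in (\lambda_t^-, \lambda^+)$ satisfies

$$\tau = \int_a^{\lambda^+} d\mu_\lambda.$$

Proof: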



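1) Though $g(\lambda)$ is not Lipschitz continuous on $[\lambda_t^-, \lambda^+]$, we are able to construct sequences of Lipschitz functions $g_k^+(\lambda)$ and $g_k^-(\lambda)$ such that $g_k^\pm(\lambda)$'s are Lipschitz continuous on $[\lambda_t^-, \lambda^+]$ for all $k$, $g_k^+(\lambda) \ge g(\lambda)$ and $g_k^-(\lambda) \le g(\lambda)$ for $\lambda \in [\lambda_t^-, \lambda^+]$, and $g_k^\pm(\lambda) \to g(\lambda)$ pointwise as $k \to \infty$. Due to their Lipschitz continuity, $g_k^\pm(\lambda)$'s are integrable with respect to $\mu_\lambda$. Then we have

$$\lim_{k\to\infty}\lim_{(n,m)\to\infty} \int g_k^-(\lambda) \cdot d\mu_{n,\lambda}(\lambda) \le \lim_{(n,m)\to\infty} \int g(\lambda) \cdot d\mu_{n,\lambda}(\lambda) \le \lim_{k\to\infty}\lim_{(n,m)\to\infty} \int g_k^+(\lambda) \cdot d\mu_{n,\lambda}(\lambda),$$

while

$$\lim_{k\to\infty}\lim_{(n,m)\to\infty} \int g_k^-(\lambda) \cdot d\mu_{n,\lambda}(\lambda) = \lim_{k\to\infty} \int g_k^-(\lambda) \cdot d\mu_\lambda(\lambda) = \int g(\lambda) \cdot d\mu_\lambda(\lambda)$$

and

$$\lim_{k\to\infty}\lim_{(n,m)\to\infty} \int g_k^+(\lambda) \cdot d\mu_{n,\lambda}(\lambda) = \int g(\lambda) \cdot d\mu_\lambda(\lambda)$$

almost surely. This proves the almost sure statement, and the convergence of the expectation follows from dominated convergence.

2) follows from the first part upon setting $g(\lambda) = \chi_{[a,\lambda^+]}(\lambda)$.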

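3) Since $a \in (\lambda_t^-, \lambda^+)$, there exists an $\epsilon > 0$ such that $(a - \epsilon, a + \epsilon) \subset (\lambda_t^-, \lambda^+)$. For any $\delta > 0$, define the events

$$A_{n,a+\epsilon} = \left\{\boldsymbol{\lambda} : \frac{1}{n}\left|\{\lambda_i : \lambda_i \ge a + \epsilon\}\right| < \tau\right\}, \qquad A_{n,a-\epsilon} = \left\{\boldsymbol{\lambda} : \frac{1}{n}\left|\{\lambda_i : \lambda_i \ge a - \epsilon\}\right| > \tau\right\},$$

$$B_{n,a+\epsilon,\delta} = \left\{\boldsymbol{\lambda} : \left|\frac{1}{n}\sum_{\lambda_i \ge a+\epsilon} \lambda_i - \int_{a+\epsilon}^{\lambda^+} \lambda \cdot d\mu_\lambda\right| < \delta\right\},$$

and

$$B_{n,a-\epsilon,\delta} = \left\{\boldsymbol{\lambda} : \left|\frac{1}{n}\sum_{\lambda_i \ge a-\epsilon} \lambda_i - \int_{a-\epsilon}^{\lambda^+} \lambda \cdot d\mu_\lambda\right| < \delta\right\}.$$

According to the first part of this theorem, it can be verified that $\forall \epsilon > 0$, as $(n,m) \to \infty$, $\mu_{n,\lambda}(A_{n,a\pm\epsilon}) \to 1$ and $\mu_{n,\lambda}(B_{n,a\pm\epsilon,\delta}) \to 1$. Then for sufficiently large $n$,

$$\begin{aligned}
\mathrm{E}_\lambda\left[\frac{1}{n}\sum_{i \le n\tau} \lambda_i\right]
&\ge \mathrm{E}_\lambda\left[\frac{1}{n}\sum_{i \le n\tau} \lambda_i,\; A_{n,a+\epsilon} \cap B_{n,a+\epsilon,\delta}\right] \\
&\overset{(a)}{\ge} \mathrm{E}_\lambda\left[\frac{1}{n}\sum_{\lambda_i \ge a+\epsilon} \lambda_i,\; A_{n,a+\epsilon} \cap B_{n,a+\epsilon,\delta}\right] \\
&\overset{(b)}{\ge} \mathrm{E}_\lambda\left[\int_{a+\epsilon}^{\lambda^+} \lambda \cdot d\mu_\lambda - \delta,\; A_{n,a+\epsilon} \cap B_{n,a+\epsilon,\delta}\right] \\
&\ge \left(\int_{a+\epsilon}^{\lambda^+} \lambda \cdot d\mu_\lambda - \delta\right)(1 - \delta),
\end{aligned} \tag{18}$$

where $\mathrm{E}[\cdot, A]$ denotes the expectation operation on the measurable set $A$, and (a) and (b) follow from the definitions of $A_{n,a+\epsilon}$ and $B_{n,a+\epsilon,\delta}$ respectively. Similarly, when $n$ is large enough,

$$\begin{aligned}
\mathrm{E}_\lambda\left[\frac{1}{n}\sum_{i \le n\tau} \lambda_i\right]
&\le \mathrm{E}_\lambda\left[\frac{1}{n}\sum_{i \le n\tau} \lambda_i,\; A_{n,a-\epsilon}\right] + \mathrm{E}_\lambda\left[\lambda_1,\; A_{n,a-\epsilon}^c\right] \\
&\overset{(c)}{\le} \mathrm{E}_\lambda\left[\frac{1}{n}\sum_{\lambda_i \ge a-\epsilon} \lambda_i,\; A_{n,a-\epsilon}\right] + \delta \\
&\le \mathrm{E}_\lambda\left[\int_{a-\epsilon}^{\lambda^+} \lambda \cdot d\mu_\lambda + \delta,\; A_{n,a-\epsilon} \cap B_{n,a-\epsilon,\delta}\right] + \mathrm{E}_\lambda\left[\lambda_1,\; A_{n,a-\epsilon} \cap B_{n,a-\epsilon,\delta}^c\right] + \delta \\
&\overset{(d)}{\le} \left(\int_{a-\epsilon}^{\lambda^+} \lambda \cdot d\mu_\lambda + \delta\right) \mu_{n,\lambda}\left(A_{n,a-\epsilon} \cap B_{n,a-\epsilon,\delta}\right) + 2\delta \\
&\le \int_{a-\epsilon}^{\lambda^+} \lambda \cdot d\mu_\lambda + 3\delta,
\end{aligned} \tag{19}$$

where (c) and (d) are an application of Proposition 1. Now let $\delta \downarrow 0$ and then $\epsilon \downarrow 0$. Then we have proved that

$$\lim_{(n,m)\to\infty} \mathrm{E}\left[\frac{1}{n}\sum_{1 \le i \le n\tau} \lambda_i\right] = \int_a^{\lambda^+} \lambda \cdot d\mu_\lambda. \qquad ■$$

B. Proof of Lemma 1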

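As the first step, we compute the asymptotic distribution and expectation of $X_{(n:n)}$. It can be verified that

$$1 - F_X(y) = \int_y^{+\infty} f_X(x)\, dx = e^{-y} \sum_{i=0}^{L-1} \frac{y^i}{i!},$$

and for $\forall a > 0$,

$$\int_a^{+\infty} \left(1 - F_X(y)\right) dy = e^{-a} \sum_{i=0}^{L-1} \sum_{j=0}^{i} \frac{a^j}{j!} = e^{-a} \sum_{j=0}^{L-1} \frac{L-j}{j!}\, a^j.$$

For $0 < t < +\infty$, define

$$R(t) := \frac{\int_t^{+\infty} \left(1 - F_X(y)\right) dy}{1 - F_X(t)}.$$

Then

$$\lim_{t\to+\infty} R(t) = \lim_{t\to+\infty} \frac{\sum_{j=0}^{L-1} \frac{L-j}{j!}\, t^j}{\sum_{i=0}^{L-1} \frac{1}{i!}\, t^i} = 1. \tag{20}$$

Now let

$$a_n = \inf\left\{x : 1 - F_X(x) \le \frac{1}{n}\right\}$$

and

$$b_n = R(a_n) = \frac{\sum_{j=0}^{L-1} \frac{L-j}{j!}\, a_n^j}{\sum_{i=0}^{L-1} \frac{1}{i!}\, a_n^i}.$$

It can be verified that $a_n \to +\infty$ as $n \to \infty$, and that $b_n \to 1$ by (20). Furthermore,

$$\lim_{n\to\infty} n\left[1 - F_X(a_n + x b_n)\right] = \lim_{n\to\infty} \frac{1 - F_X(a_n + x b_n)}{1 - F_X(a_n)} = \lim_{n\to\infty} e^{-x b_n}\, \frac{\sum_{i=0}^{L-1} \frac{1}{i!}(a_n + x b_n)^i}{\sum_{i=0}^{L-1} \frac{1}{i!}\, a_n^i} = e^{-x}.$$

Therefore, for all $x \in \mathbb{R}$ and sufficiently large $n$,

$$\begin{aligned}
P\left(X_{(n:n)} \le a_n + b_n x\right) &= \left[1 - \frac{1}{n}\, n\left(1 - F_X(a_n + b_n x)\right)\right]^n \\
&= \exp\left(n \cdot \log\left(1 - \frac{1}{n} e^{-x}(1 + o(1))\right)\right) \\
&= \exp\left(-e^{-x}(1 + o(1))\right) \\
&\to \exp\left(-e^{-x}\right).
\end{aligned} \tag{21}$$

This identifies the limiting distribution, and the tail is of sufficient decay to conclude that

$$\lim_{n\to+\infty} \mathrm{E}\left[\frac{X_{(n:n)} - a_n}{b_n}\right] = \int_{-\infty}^{+\infty} x\, d e^{-e^{-x}} =: \mu_1.$$

Given the law of the first maxima $X_{(n:n)}$, the distribution and the expectation of the $k$th maxima follow easily. With $z_n = a_n + b_n x$,

$$P\left(X_{(n-k+1:n)} \le z_n\right) = \sum_{i=0}^{k-1} \binom{n}{i} \left(1 - F_X(z_n)\right)^i F_X(z_n)^{n-i}.$$

According to (21), $\binom{n}{i}\left(1 - F_X(z_n)\right)^i \to \frac{1}{i!} e^{-ix}$ and $F_X(z_n)^{n-i} \to e^{-e^{-x}}$ as $n \to \infty$. Thus

$$P\left(\frac{X_{(n-k+1:n)} - a_n}{b_n} \le x\right) \to \exp\left(-e^{-x}\right) \sum_{i=0}^{k-1} \frac{1}{i!}\, e^{-ix}.$$

Denote it by $H_k(x)$. The corresponding PDF is given by

$$h_k(x) = H_k'(x) = \frac{e^{-kx} e^{-e^{-x}}}{(k-1)!}.$$

Define $\mu_k = \int_{-\infty}^{+\infty} x\, h_k(x)\, dx$. Evaluating $\mu_k$ gives an iterative formula

$$\begin{aligned}
\mu_k &= \frac{1}{(k-1)!} \int_{-\infty}^{+\infty} x e^{-kx} e^{-e^{-x}}\, dx \\
&= \frac{1}{(k-2)!} \int_{-\infty}^{+\infty} x e^{-(k-1)x} e^{-e^{-x}}\, dx - \frac{1}{(k-1)!} \int_{-\infty}^{+\infty} e^{-(k-1)x} e^{-e^{-x}}\, dx \\
&= \mu_{k-1} - \frac{1}{k-1},
\end{aligned} \tag{22}$$

where the last step follows from the fact that $\frac{1}{(k-2)!} e^{-(k-1)x} \exp\left(-e^{-x}\right)$ is the asymptotic pdf of the $(k-1)$th maxima. Therefore,

$$\lim_{n\to+\infty} \mathrm{E}\left[\frac{X_{(n-k+1:n)} - a_n}{b_n}\right] = \mu_k = \mu_1 - \sum_{i=1}^{k-1} \frac{1}{i},$$

and so also,

$$\lim_{n\to+\infty} \mathrm{E}\left[\frac{\sum_{k=1}^{s} X_{(n-k+1:n)} - s a_n}{b_n}\right] = \sum_{k=1}^{s} \mu_k = s\mu_1 - \sum_{k=1}^{s} \sum_{i=1}^{k-1} \frac{1}{i}.$$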
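The recursion for $\mu_k$ can be checked numerically. The sketch below integrates $x\, h_k(x)$ on a truncated grid and compares the result with $\mu_1 - \sum_{i=1}^{k-1} 1/i$, using the standard fact that the mean of the limiting law $\exp(-e^{-x})$ is the Euler-Mascheroni constant. The function names, grid limits and step size are assumptions of this illustration.

```python
import numpy as np
from math import factorial

def mu_k_numeric(k):
    """Numerically integrate x * h_k(x), with h_k(x) = e^{-kx} e^{-e^{-x}} / (k-1)!."""
    x = np.linspace(-8.0, 60.0, 680001)          # truncated range, step 1e-4
    h = np.exp(-k * x - np.exp(-x)) / factorial(k - 1)
    y = x * h
    dx = x[1] - x[0]
    return float(np.sum((y[1:] + y[:-1]) / 2) * dx)   # trapezoidal rule

def mu_k_closed_form(k):
    """Euler-Mascheroni constant minus the (k-1)th harmonic number."""
    return float(np.euler_gamma - sum(1.0 / i for i in range(1, k)))

values = [(mu_k_numeric(k), mu_k_closed_form(k)) for k in range(1, 5)]
```

Each pair in `values` holds the numerical integral and the closed form for $k = 1, \ldots, 4$, so the two columns can be compared directly.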



C. Proof of Theorem [7] 

The proof is similar to that of Theorem 2 in our earlier paper [11]; the difference is that the composite Grassmann manifold is of interest here while the "single" Grassmann manifold is the focus in that work. The key step of this proof is the volume calculation of a small ball in the composite Grassmann manifold. Given the volume formula, the upper and lower bounds follow from the same arguments as in [11].






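A metric ball in $\mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ centered at $P^{(m)} \in \mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ with radius $\delta \ge 0$ is defined as

$$B_{P^{(m)}}(\delta) := \left\{Q^{(m)} \in \mathcal{G}^{(m)}_{n,p}(\mathbb{L}) : d_c\left(P^{(m)}, Q^{(m)}\right) \le \delta\right\}.$$

The volume of $B_{P^{(m)}}(\delta)$ is defined as the probability of an isotropically distributed $Q^{(m)} \in \mathcal{G}^{(m)}_{n,p}(\mathbb{L})$ in this ball:

$$\mu\left(B_{P^{(m)}}(\delta)\right) := \Pr\left(Q^{(m)} \in B_{P^{(m)}}(\delta)\right).$$

Since $\mu\left(B_{P^{(m)}}(\delta)\right)$ is independent of the choice of the center $P^{(m)}$, we simply denote it by $\mu^{(m)}(\delta)$. We have:

Theorem 7: When $\delta \le 1$,

$$\mu^{(m)}(\delta) = \frac{\Gamma^m\left(\frac{t}{2} + 1\right)}{\Gamma\left(m\frac{t}{2} + 1\right)}\, c^m_{n,p,p,\beta}\, \delta^{mt} \left(1 + O\left(\delta^2\right)\right), \tag{23}$$

where $c_{n,p,p,\beta}$ and $t$ are defined earlier in the paper.

Proof: Let us drop the subscript of $c_{n,p,p,\beta}$ during the proof. In [11], we proved that for a single Grassmann manifold, $\mu\left(d_c^2 \le x\right) = \mu\left(\sqrt{x}\right) = c x^{\frac{t}{2}} (1 + O(x))$ when $x \le 1$, and it can be verified that

$$d\mu\left(d_c^2 \le x\right) = \frac{t}{2} c x^{\frac{t}{2} - 1} (1 + O(x)) \cdot dx.$$

By the definition of the volume, $d\mu^{(2)}(x)/dx$ is a convolution of $d\mu(x)/dx$ and $d\mu(x)/dx$. So,

$$\begin{aligned}
\frac{d\mu^{(2)}(x)}{dx} &= \int_0^x \frac{t^2}{4} c^2 \tau^{\frac{t}{2}-1} (x - \tau)^{\frac{t}{2}-1} (1 + O(\tau)) (1 + O(x - \tau))\, d\tau \\
&\overset{(a)}{=} \frac{t^2}{4} c^2 x^{t-1} \int_0^1 y^{\frac{t}{2}-1} (1 - y)^{\frac{t}{2}-1} \left(1 + O(xy) + O(x(1-y))\right) dy \\
&= \frac{t^2}{4} c^2 x^{t-1}\, \frac{\Gamma^2\left(\frac{t}{2}\right)}{\Gamma(t)} (1 + O(x)),
\end{aligned}$$

where (a) follows from the variable change $\tau = xy$. A calculation produces

$$\mu^{(2)}\left(d_c^2 \le x\right) = \frac{\Gamma^2\left(\frac{t}{2} + 1\right)}{\Gamma(t + 1)}\, c^2 x^t (1 + O(x)).$$

By mathematical induction, we reach (23). Note that $\delta \le 1$ is required in every step. ■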
Based on the volume formula, an upper bound on the distortion rate function $D^*(K)$ on the composite Grassmann manifold

$$D^*(K) \le \frac{2}{mt}\, \Gamma\left(\frac{2}{mt}\right) \frac{\Gamma^{\frac{2}{mt}}\left(m\frac{t}{2} + 1\right)}{\Gamma^{\frac{2}{t}}\left(\frac{t}{2} + 1\right)}\, c_{n,p,p,\beta}^{-\frac{2}{t}}\, 2^{-\frac{2 \log_2 K}{mt}} (1 + o(1))$$

is derived by calculating the average distortion of random codes (see [11] for details). Furthermore, by the sphere packing/covering argument (again see [11] for details), the lower bound

$$\frac{mt}{mt + 2}\, \frac{\Gamma^{\frac{2}{mt}}\left(m\frac{t}{2} + 1\right)}{\Gamma^{\frac{2}{t}}\left(\frac{t}{2} + 1\right)}\, c_{n,p,p,\beta}^{-\frac{2}{t}}\, 2^{-\frac{2 \log_2 K}{mt}} (1 + o(1)) \le D^*(K)$$

is arrived at. The theorem is proved.

D. Proof of Theorem 5

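The key step is to prove that

$$\mathrm{E}_{\mathcal{B},\mathbf{H}}\left[\mathbf{V}_k^\dagger \mathbf{b}_k \mathbf{b}_k^\dagger \mathbf{V}_k\right] = \mathrm{diag}\left[\frac{\gamma}{s}, \frac{s - \gamma}{s(L_T - 1)}, \cdots, \frac{s - \gamma}{s(L_T - 1)}\right],$$

where $\mathbf{V}_k$ is from the singular value decomposition $\mathbf{H}_{(k)} = \mathbf{U}_k \boldsymbol{\Lambda}_k \mathbf{V}_k^\dagger$. Let $\mathbf{V}_k = [\mathbf{v}_k\ \bar{\mathbf{V}}_k]$, where $\bar{\mathbf{V}}_k \in \mathbb{C}^{L_T \times (L_T - 1)}$ is composed of all the columns of $\mathbf{V}_k$ except $\mathbf{v}_k$. Let $\mathbf{V} = [\mathbf{v}_1 \cdots \mathbf{v}_s]$. Recall our feedback function $\varphi(\mathbf{V})$ in (13) and the definition of $\gamma$ in (15). Then the fact that

$$\mathrm{E}\left[\mathbf{v}_k^\dagger \mathbf{b}_k \mathbf{b}_k^\dagger \mathbf{v}_k\right] = \frac{\gamma}{s}$$

is implied by the following lemma.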

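Lemma 5: Let $\mathbf{V} \in \mathcal{M}^{(s)}_{L_T \times 1}$ be isotropically distributed and $\mathcal{B} \subset \mathcal{M}^{(s)}_{L_T \times 1}$ be randomly generated from the isotropic distribution. Let $\mathbf{B} = \varphi(\mathbf{V})$, where $\varphi(\cdot)$ is given in (13) and $\gamma$ is given by (15). Then

$$\mathrm{E}_{\mathcal{B},\mathbf{V}}\left[\mathbf{V}^\dagger \mathbf{B}\mathbf{B}^\dagger \mathbf{V}\right] = \frac{\gamma}{s}\, \mathbf{I}_s.$$

Proof: Let $\mathbf{Z} = \mathrm{E}_{\mathcal{B},\mathbf{V}}\left[\mathbf{V}^\dagger \mathbf{B}\mathbf{B}^\dagger \mathbf{V}\right]$. For any $\theta \in [0, 2\pi)$, let $\mathbf{A}_k = \mathrm{diag}\left[1, \cdots, 1, e^{j\theta}, 1, \cdots, 1\right]$ be obtained by replacing the $k$th diagonal element of $\mathbf{I}$ with $e^{j\theta}$. It can be verified that $\mathbf{V}\mathbf{A}_k \in \mathcal{M}^{(s)}_{L_T \times 1}$ is isotropically distributed, and $\varphi(\mathbf{V}\mathbf{A}_k) = \varphi(\mathbf{V}) = \mathbf{B}$. We have

$$\mathbf{Z} = \mathrm{E}_{\mathcal{B},\mathbf{V}\mathbf{A}_k}\left[\mathbf{A}_k^\dagger \mathbf{V}^\dagger \mathbf{B}\mathbf{B}^\dagger \mathbf{V} \mathbf{A}_k\right] = \mathbf{A}_k^\dagger\, \mathrm{E}_{\mathcal{B},\mathbf{V}}\left[\mathbf{V}^\dagger \mathbf{B}\mathbf{B}^\dagger \mathbf{V}\right] \mathbf{A}_k = \mathbf{A}_k^\dagger \mathbf{Z} \mathbf{A}_k,$$

where the first equality is obtained by changing the variable from $\mathbf{V}$ to $\mathbf{V}\mathbf{A}_k$, and the second equality is obtained by replacing the measure of $\mathbf{V}\mathbf{A}_k$ with the measure of $\mathbf{V}$. Then $(\mathbf{Z})_{k,j} = e^{-j\theta} (\mathbf{Z})_{k,j}$ for $j \ne k$, which is only possible if $(\mathbf{Z})_{k,j} = 0$. Therefore, $\mathbf{Z}$ is a diagonal matrix.

Now let $\mathbf{P} \in \mathbb{R}^{s \times s}$ be a permutation matrix generated by permuting rows/columns of the identity matrix. Let $\mathcal{B}\mathbf{P} = \{\mathbf{B}\mathbf{P} : \mathbf{B} \in \mathcal{B}\}$. Then $\mathbf{V}\mathbf{P} \in \mathcal{M}^{(s)}_{L_T \times 1}$ and $\mathcal{B}\mathbf{P} \subset \mathcal{M}^{(s)}_{L_T \times 1}$ are isotropically distributed. It can be verified that $\varphi_{\mathcal{B}\mathbf{P}}(\mathbf{V}\mathbf{P}) = \mathbf{B}\mathbf{P} = \varphi_{\mathcal{B}}(\mathbf{V})\mathbf{P}$, where the subscript of $\varphi$ emphasizes the choice of codebook. Then,

$$\begin{aligned}
\mathbf{Z} &= \mathrm{E}_{\mathcal{B}\mathbf{P},\mathbf{V}\mathbf{P}}\left[(\mathbf{V}\mathbf{P})^\dagger \varphi_{\mathcal{B}\mathbf{P}}(\mathbf{V}\mathbf{P})\, \varphi_{\mathcal{B}\mathbf{P}}(\mathbf{V}\mathbf{P})^\dagger (\mathbf{V}\mathbf{P})\right] \\
&= \mathbf{P}^\dagger\, \mathrm{E}_{\mathcal{B}\mathbf{P},\mathbf{V}}\left[\mathbf{V}^\dagger \varphi_{\mathcal{B}\mathbf{P}}(\mathbf{V}\mathbf{P})\, \varphi_{\mathcal{B}\mathbf{P}}(\mathbf{V}\mathbf{P})^\dagger \mathbf{V}\right] \mathbf{P} \\
&= \mathbf{P}^\dagger\, \mathrm{E}_{\mathcal{B}\mathbf{P},\mathbf{V}}\left[\mathbf{V}^\dagger \varphi_{\mathcal{B}}(\mathbf{V})\, \mathbf{P}\mathbf{P}^\dagger \varphi_{\mathcal{B}}(\mathbf{V})^\dagger \mathbf{V}\right] \mathbf{P} \\
&= \mathbf{P}^\dagger\, \mathrm{E}_{\mathcal{B},\mathbf{V}}\left[\mathbf{V}^\dagger \varphi_{\mathcal{B}}(\mathbf{V})\, \varphi_{\mathcal{B}}(\mathbf{V})^\dagger \mathbf{V}\right] \mathbf{P} \\
&= \mathbf{P}^\dagger \mathbf{Z} \mathbf{P},
\end{aligned}$$

where the first equality is obtained by a change of variables, and the second and fourth equalities follow from measure replacement. It follows that $(\mathbf{Z})_{i,i} = (\mathbf{Z})_{j,j}$ for $1 \le i, j \le s$.

Finally, $\mathbf{Z} = \frac{\gamma}{s}\mathbf{I}$ follows from the fact that $\mathrm{tr}(\mathbf{Z}) = \mathrm{E}\left[\mathrm{tr}\left(\mathbf{V}^\dagger \mathbf{B}\mathbf{B}^\dagger \mathbf{V}\right)\right] = \gamma$. ■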

We evaluate 



E 



V[b fe b[V 



E 
E 



vlb k b\v k 

vib fc biv fc 



E 
E 



v!b fc bIV fc 
Vtb fc blv fc 



For any unitary matrix U r G C^ Lt 1 ^ x( - Lt [v fc ,V fc U] is also isotropically distributed. Employ the 
method in the proof of Lemma [5] to find that 



E 



v[b fc b[V 



E 



U,
and E



vib fc biv fc 



Vlb fe b+V fc 



u. 



Therefore, E 



^andE 



tr ( Vlh k b{V k 



Finally, 



and E 



vib fc btv fc 



1. Hence, c 



s— 7 
s(L T -l) 



-E 



E 

.fc=i 



cI Lt _! for some constant c. Note that E v\,h k h\v k 



and E 



V[b fc bj!V fc 



diag 



s ' s(L T -l) ' 



s(Lt-I) 



J2 E B,n W (H (fc) b fc b^H 



k=\ 



sL 



R 



sL 



k=l 



E 



H 



R 



E 7^ e h K«l + 



fc=i 



s _ ^ (1 — Ci) Eh 



(fc) 



Lr - 1 



ifl \ S 8 L T -\ 



(■)■' 



where the third line follows from the fact that $\boldsymbol{\Lambda}_k$ is independent of $\mathbf{V}_k$ and $\mathbf{b}_k$.